Does Instagram Use AWS? The Surprising Truth About Its Infrastructure

Introduction:


Instagram was purchased by Meta (then Facebook) in 2012 for 1 billion dollars.

When Instagram was a startup, it used AWS for compute and storage.

AWS S3 was their main engine for storing images. They introduced video posting in 2013.

S3 gave them very fast development: they did not need to write storage APIs or invest thousands of dollars in infrastructure up front, because AWS let them pay only for the compute power and storage they used.

Instagram became massively famous, and its user base grew rapidly.

But when it came under Meta’s umbrella, the Meta team noticed that Instagram was burning cash on AWS while Meta had its own infrastructure, which was much cheaper than AWS.

Why did Meta decide to change the hosting of Instagram when millions of users were using it? Was it a risky move?

Meta is a very big company. When Facebook's user base was exploding, Mark Zuckerberg thought they needed to build their own infrastructure; otherwise they would burn through their cash, which would be very risky in the future.

AWS compute and S3 were both burning cash, and this was one of the main reasons Meta wanted to change Instagram's infrastructure.

Another important factor was control over data. Meta hosts everything in its own data centers, so they thought Instagram should follow the same strategy.

Latency was also a big reason. AWS serves thousands of customers, while Meta's infrastructure serves only Meta, so Meta's own infrastructure offered much better latency and speed.

Preparation of Migration:

This meant migrating the data of millions of users along with a complex codebase. It was a critical migration because those users could not face any interruption, or they might lose trust in the product.

This task looked legendary in scale, but Meta had to deliver it together with the Instagram team.

Given the scale, you would think at least hundreds of engineers must have delivered it.

But the migration team actually had only 8 to 10 engineers.

Instagram had 200 million users at that time and held petabytes of data.

Not only that, users were still posting huge amounts of data every day, every hour, and every minute.

The team needed to ensure zero interruption for users around the world.

Migration

Meta wanted zero downtime, which was a very ambitious expectation.

Dual Write Approach:

Meta used a dual write approach for the migration.

In this approach, Meta wrote code so that whenever a user uploaded a photo, it was stored both on the AWS servers and on Meta’s servers, which would become Instagram’s new servers.

The dual write approach is a well-known and reliable approach in migrations.

With this approach, both the legacy servers and the new servers receive the latest data.
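To make the idea concrete, here is a minimal sketch of a dual write in Python. It is not Meta's actual code: the `InMemoryStore` and `DualWriter` classes and the `failed_keys` list are hypothetical names made up for illustration, and the two stores are plain in-memory dictionaries standing in for AWS S3 and the new servers.

```python
class InMemoryStore:
    """Hypothetical stand-in for a blob store (legacy or new)."""

    def __init__(self, name):
        self.name = name
        self.blobs = {}

    def put(self, key, data):
        self.blobs[key] = data

    def get(self, key):
        return self.blobs[key]


class DualWriter:
    """Writes every upload to the legacy store, then mirrors it to the new store."""

    def __init__(self, legacy, new):
        self.legacy = legacy
        self.new = new
        self.failed_keys = []  # keys that still need to be copied to the new store

    def upload(self, key, data):
        # The legacy store stays the source of truth, so this write must succeed.
        self.legacy.put(key, data)
        try:
            self.new.put(key, data)
        except Exception:
            # A failure on the new side must not break the user's upload.
            self.failed_keys.append(key)

    def read(self, key):
        # Reads keep coming from the legacy store until the cutover.
        return self.legacy.get(key)


legacy = InMemoryStore("legacy")
new = InMemoryStore("new")
writer = DualWriter(legacy, new)
writer.upload("photo-1.jpg", b"\xff\xd8 example bytes")
```

In this sketch the user's upload only depends on the legacy write, so a problem in the new store is recorded for a later retry instead of being shown to the user.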

Advantages of the Dual Write Approach

  1. Zero downtime
  2. Real production data test: you are writing real production data to the new servers, so you can check that the new data APIs and new storage work correctly and that your infrastructure is scalable and fast enough.
  3. Safe rollback: the old systems are fine, but what about the new systems, APIs, storage, and servers? If something is not working, rollback is smooth with the dual write approach.
  4. Control: you can control the flow of data to the new infra. First you can test with 20% of the servers, and if everything is fine, you can keep increasing that percentage.
  5. Parallel development: the team can keep developing new features; the migration is not a blocker for them.
  6. Data safety: if the new system fails, you still have the data on your old system.
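The control and rollback points above can be sketched as a percentage gate. This is only an illustration under assumptions: the function name `in_rollout` and the choice to bucket by user ID (instead of by server) are hypothetical and not something Meta has described.

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 buckets."""
    # Hashing the ID gives each user a stable bucket between 0 and 99.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


# Start at 20 percent, then raise the number as confidence grows.
ROLLOUT_PERCENT = 20
mirror_to_new_servers = in_rollout("user-12345", ROLLOUT_PERCENT)
```

Because each user always lands in the same bucket, raising the percentage only adds users and never removes any, and setting it to 0 acts as a rollback switch.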

 

The migration team included engineers from both Meta and Instagram.

During this migration, they created some tools. Meta has not published details about them, but some information is still available.

They created a verifier daemon that checked each file’s integrity using an MD5 checksum. If even a single pixel differed, the tool retried the write.
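Since the details of that tool are not public, the following is only a rough sketch of what a checksum-based verifier with retries could look like. The names `md5_hex` and `verify_and_repair`, the `max_retries` parameter, and the use of plain dictionaries as the two stores are all assumptions made for illustration. Note that it compares file bytes, so any changed pixel that alters the stored bytes shows up as a checksum mismatch.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 checksum of the given bytes as a hex string."""
    return hashlib.md5(data).hexdigest()


def verify_and_repair(key, source, target, max_retries=3):
    """Compare the MD5 of the source and target copies; rewrite the target on mismatch.

    Returns True once the target copy matches, False if it still differs
    after `max_retries` rewrites.
    """
    expected = md5_hex(source[key])
    for attempt in range(max_retries + 1):
        copy = target.get(key)
        if copy is not None and md5_hex(copy) == expected:
            return True
        if attempt < max_retries:
            # Retry the write using the source copy as the truth.
            target[key] = source[key]
    return False


old_store = {"photo-1.jpg": b"original bytes"}
new_store = {"photo-1.jpg": b"corrupted bytes"}
repaired = verify_and_repair("photo-1.jpg", old_store, new_store)
```

A real daemon would run this kind of check continuously over all migrated files; the sketch only shows the check for one key.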

 

In this process, Meta migrated 20 billion photos to their new servers, which was a massive undertaking.

 

I promise that I will give daily deep system facts on DSS (DeepSystemStuff.com)
