Introduction:
In 2014, Instagram was at a crossroads. It had a massive user base but was burning cash on AWS S3 storage. With over 20 billion photos and a library growing at an alarming rate, the engineering team decided to attempt what looked impossible: move everything to Facebook’s data centers without the world noticing.
The Engineering Powerhouse:
You might think it took thousands of people, but the core migration was handled by a surprisingly small and elite team. Only about 8 to 10 core engineers were responsible for the architecture of this migration. They had to ensure that while they moved petabytes of data, the 200 million active users could still post their lunch photos without a single error message.
The Strategy: Zero Downtime or Bust
To achieve Zero Downtime, the team used a method called “The Dark Launch”:
Dual-Writing: Every time a user uploaded a photo, it was written to both Amazon S3 and Instagram’s new servers at the same time.
Verification: They built a “Verifier” daemon that checked every single file’s integrity using MD5 checksums. If even one byte of a copy differed from the original, the checksums would not match and the migration tool would retry the copy.
The Cutover: Once all 20 billion old photos were copied, they simply flipped the “Read” switch from AWS to their own servers.
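The three steps above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Instagram's actual code: `InMemoryStore`, `PhotoService`, and `backfill_and_verify` are hypothetical names, and a plain dictionary stands in for both Amazon S3 and the new storage servers.

```python
import hashlib


class InMemoryStore:
    """Hypothetical stand-in for a blob store (not a real S3 or Facebook API)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]

    def keys(self):
        return list(self._blobs)


def md5_hex(data):
    """Return the MD5 checksum of a bytes object as a hex string."""
    return hashlib.md5(data).hexdigest()


class PhotoService:
    def __init__(self, old_store, new_store):
        self.old_store = old_store
        self.new_store = new_store
        self.read_from_new = False  # the "Read" switch, off until cutover

    def upload(self, key, data):
        # Dual-writing: every new photo lands in both stores.
        self.old_store.put(key, data)
        self.new_store.put(key, data)

    def read(self, key):
        # Reads follow the switch, so cutover is a single flag change.
        store = self.new_store if self.read_from_new else self.old_store
        return store.get(key)


def backfill_and_verify(old_store, new_store, max_retries=3):
    """Copy old objects to the new store and return keys that never verified."""
    failed = []
    for key in old_store.keys():
        source = old_store.get(key)
        expected = md5_hex(source)
        for _ in range(max_retries):
            new_store.put(key, source)
            # Verification: compare checksums of the source and the copy.
            if md5_hex(new_store.get(key)) == expected:
                break
        else:
            # Every attempt produced a mismatching copy.
            failed.append(key)
    return failed
```

In this sketch, the cutover is just `service.read_from_new = True`, set only after `backfill_and_verify` returns an empty list. Because uploads keep going to both stores in the meantime, the old store stays complete and the switch can be flipped back if something goes wrong.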
Why did they do it? (The 3 Big Pillars)
Cost Saving: Cloud storage for 20 billion photos is insanely expensive. Moving to internal servers saved them millions of dollars every single month.
Total Control: On AWS, they were limited by Amazon’s infrastructure. In their own data centers, they could optimize the hardware specifically for photo-heavy traffic.
Latency: Once Instagram was on the same network as Facebook, data transfer speeds improved significantly, making the app feel “snappier” for users.
Conclusion:
Moving 20 billion photos is like moving a skyscraper while people are still living in it. It remains one of the greatest feats in the history of system design.
I am Nadeem, a systems enthusiast. My answers on Quora got 500k views in just 45 days and received upvotes from senior Microsoft employees.
I promise to share deep system facts daily on DSS (DeepSystemStuff.com).