Presentation: Scaling Dropbox

Duration: 4:10pm - 5:00pm

Key Takeaways

  • Learn the challenges Dropbox faces with reads and writes, and how they address them.
  • Understand the key design principle behind the architecture of a large storage application.
  • Learn lessons and tips for dealing with massive scale, cascading failure, architecture, MySQL, Memcached, and more.

Abstract

Dropbox is a technology company that builds simple, powerful products for people and businesses. We’ve grown enormously since launching in 2008, surpassing 500 million signups and storing over 500 petabytes of user data. Since we started, Dropbox users have created 3.3 billion connections by sharing with each other and are saving 1.2 billion files per day. In this talk, we’ll discuss how Dropbox’s infrastructure evolved over the years and how it looks today, as well as the challenges and lessons learned along the way.

Interview

Question: 
QCon: What is your role today?
Answer: 

Preslav: I’m a software engineer on the storage team at Dropbox. On the storage team, we’re responsible for storing files in Dropbox (which is critical for us). Durability is very important for Dropbox because it’s a service for backing up your files. It’s a very different problem. That’s why we’re so concerned about durability. We built our internal storage software (Magic Pocket) with that in mind.

Before joining this team, I spent a year in infrastructure performance.

Question: 
QCon: You talk about the scale of storing files at Dropbox, and that’s got to be massive. What kind of scale are you talking about?
Answer: 

Preslav: We store more than 500 petabytes.

Question: 
QCon: Could you tell me about Magic Pocket?
Answer: 

Preslav: Some parts of it are literally a replacement for S3. Because we use it for internal purposes, we don’t need to build everything that S3 offers. Magic Pocket gives us the flexibility to tune and optimize for our use cases at Dropbox.

Question: 
QCon: Can you explain your talk to me?
Answer: 

Preslav: I’ll be focusing on the whole Dropbox architecture. There is already a good talk on the internet from early 2012 about how Dropbox scaled as a startup from 0 to 50 million users, so I will focus on how we scaled as a more mature company from 50 to 500 million. I’ll cover what’s unique about Dropbox’s architecture, what scaling issues we faced, and how we solved them.

Unlike many companies, we really care about consistency. If you delete a file in your folder, we need to apply changes like that in strict order. Another thing that is different for Dropbox is that we serve a lot of writes. Most companies (think of YouTube or Facebook) serve some number of writes, but really they see 10, 100, or 1,000 times more reads than writes. For us, reads and writes are roughly equal. That makes our scaling challenges a bit different from what you might have heard about before.

Question: 
QCon: Do you dive into some of the approaches you are using to handle consistency?
Answer: 

Preslav: One example is our memcache (we modified the library for our use case). Usually you use memcache by just reading from it. If the cache is not available, you read from the database. But for Dropbox, memcache is also on the write path. Everything you write to the database, you also need to write to memcache in order to keep it strictly consistent. I’ll tell you why this is a problem: when your memcache is not available, your database is effectively not available either.
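To illustrate the point about the write path, here is a minimal Python sketch of a cache that sits on both the read and write paths. It is not Dropbox's implementation: the `InMemoryCache`, `Store`, and `CacheUnavailable` names are hypothetical, and plain dictionaries stand in for memcache and the database. It only shows why a cache outage is survivable for reads but blocks writes when the cache must stay strictly consistent.

```python
class CacheUnavailable(Exception):
    """Raised by the hypothetical cache client when it cannot be reached."""


class InMemoryCache:
    """Stand-in for a memcache client; `available` simulates an outage."""

    def __init__(self):
        self.data = {}
        self.available = True

    def _check(self):
        if not self.available:
            raise CacheUnavailable("cache is down")

    def get(self, key):
        self._check()
        return self.data.get(key)

    def set(self, key, value):
        self._check()
        self.data[key] = value


class Store:
    """Hypothetical storage layer with the cache on the write path."""

    def __init__(self, cache):
        self.cache = cache
        self.db = {}  # stand-in for the database

    def read(self, key):
        # Read path: a cache outage is survivable, fall back to the database.
        try:
            value = self.cache.get(key)
            if value is not None:
                return value
        except CacheUnavailable:
            pass
        return self.db.get(key)

    def write(self, key, value):
        # Write path: the cache must be updated along with the database.
        # If the cache is down, the write is rejected before the database
        # is touched, so the cache can never hold a stale value later.
        self.cache.set(key, value)  # raises CacheUnavailable during an outage
        self.db[key] = value
```

In this sketch, setting `cache.available = False` leaves `read` working through the database fallback, while every `write` raises `CacheUnavailable`. A real system would also have to handle the database write failing after the cache update, which this simplification ignores.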

Question: 
QCon: Can you give an example of one of the ways you dealt with some of these challenges?
Answer: 

Preslav: Because of the high number of writes, we needed to introduce sharding much earlier than other companies. While sharding is a simple concept in theory, in practice it is a bit more challenging. We’ll talk about historical things that didn’t go perfectly, like shard isolation and collocation.
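As a rough illustration of why sharding is simple in theory, the Python sketch below routes records to shards by hashing a key. The interview does not describe Dropbox's actual sharding scheme, so everything here is an assumption: `NUM_SHARDS`, `shard_for`, and `ShardedStore` are hypothetical names, and each shard is just a dictionary. Hashing on a shared owner key shows one simple form of collocation, where all records for the same owner land on the same shard.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen only for the example


def shard_for(owner_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an owner key to a shard index using a stable hash."""
    # hashlib gives the same result across processes, unlike the built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha1(owner_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Hypothetical store that spreads records across in-memory shards."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, owner_key: str) -> dict:
        return self.shards[shard_for(owner_key, len(self.shards))]

    def put(self, owner_key: str, record_id: str, value) -> None:
        # Routing on owner_key alone keeps an owner's records together.
        self._shard(owner_key)[(owner_key, record_id)] = value

    def get(self, owner_key: str, record_id: str):
        return self._shard(owner_key).get((owner_key, record_id))
```

The hard parts mentioned in the answer sit outside a sketch like this: changing the number of shards moves most keys under modulo routing, and a query that spans several owners has to touch several shards.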

Then we will discuss why we couldn’t scale as an organization using MySQL directly, and how we design systems at Dropbox today knowing that failure is inevitable.

Speaker: Preslav Le

Software Engineer @Dropbox

Preslav has been a software engineer at Dropbox for the past 3 years, contributing to various aspects of Dropbox’s infrastructure including traffic, performance, and storage. He was part of the core on-call and storage on-call rotations, dealing with high-emergency, real-world issues ranging from bad code pushes to complete datacenter outages. Prior to Dropbox, he worked at VMware and at a travel technology startup called Everbread. Early in his career, he was interested in computer graphics and building games, but later found there are challenging problems everywhere, especially in the clouds.

Tracks

Monday Nov 7

Tuesday Nov 8

Wednesday Nov 9

Conference for Professional Software Developers