
Presentation: Scaling Patterns for Netflix's Edge

Track: Architectures You've Always Wondered About

Location: Grand Ballroom ABC

Duration: 10:35am - 11:25am


What You’ll Learn

  1. Hear about Netflix's scalability issues and some of the ways they addressed them.
  2. Learn how splitting a service into two can help with performance and consequently with scalability.


In 2008 Netflix had less than a million streaming members. Today we have over 150 million. That explosive growth in membership has led to a similar growth in the number of microservices, in the amount of cloud resources, and our overall architectural complexity. Eventually, that sheer number of computation resources becomes hard to manage and sacrifices our reliability. At Netflix, we’ve found a few techniques that have helped keep our computation growth manageable and reliable.

There are the obvious tasks of performance tuning, reducing features, or reducing data. Going beyond just “tightening the belt” tactics, we had to rethink how we handle every request. At our scale, we can no longer call a customer database on every request, we can no longer fan out to a cascade of mid-tier requests on every request, and we can no longer log every request, so we don’t. This session will introduce the architectural patterns we’ve adopted to accomplish skipping those steps, which would normally be considered required for a functioning system.

I will also be sharing successes we’ve had from unintuitively partitioning computation into multiple services to get better runtime characteristics. Through this session, you will be introduced to useful probabilistic data structures, innovative bi-directional data passing, and open-source projects available from Netflix that make this all possible.
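The abstract doesn't name a specific probabilistic data structure, but a Bloom filter is a common example of the kind of structure that lets an edge service skip a per-request database call: a "no" answer is authoritative, so only "maybe" answers fall through to the real lookup. The sketch below is illustrative (the names and the revocation-check scenario are hypothetical, not Netflix's implementation):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: answers 'definitely not present' or 'possibly present'."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from k independent-ish hashes of the item.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Hypothetical edge check: only hit the customer database when the
# filter says the token *might* be revoked; a "no" skips the call entirely.
revoked = BloomFilter()
revoked.add("token-123")
if revoked.might_contain("incoming-token"):
    pass  # fall through to the real database lookup
```

The trade-off is a tunable false-positive rate: occasionally a request takes the slow path unnecessarily, but the filter never wrongly skips a lookup that was needed.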


What is the work you're doing today?


I'm currently focused on Functions as a Service (FaaS) at Netflix, and on how developers at Netflix can best leverage it to their advantage. Until very recently, I was working on our Edge Authentication team, where we authenticate users and devices right at our Edge layer. We formed that team to own the problem space and really focus on the complexities that come with it. Ultimately, the user experience is tied to staying logged in, so having a team focused on it meant more users stayed logged in, which meant happier users.


What are your goals for the talk?


I think there are a lot of scaling patterns that people think are inaccessible to them. They might sound fancy or seem specific to complex database servers, but we found that digging into a few of them and spending a couple of days on each made them pretty attainable. My goal here is to share some of the patterns that we use, that we know are successful, and that we think other people could also be using. Likewise, the kinds of problems seen when architecting are very difficult to put into a library. It's not like I can just ship something up to Maven Central. These are architectural patterns, so I have to describe them to someone, and then architects have to adapt them to their platform. So I really need to have that conversation with people for them to really learn these.


In the abstract, you talk about unintuitively partitioning computation into multiple services to get better runtime characteristics. Can you expand on what you mean by that?


I call this the 1+1=3 problem. Quite a few of our servers still work like monoliths: there's a lot going on in their runtime, and that inherent complexity causes issues for the JVM. By pulling apart a complex runtime into separate services, we can keep each runtime as simple as possible, and the JVM can then do amazing things. In a few cases, we pulled a library out into its own service and saw garbage collection drop by 20-30%. That performance gain came just from moving code around. The resulting deployment needed one instance, plus another instance running the extracted library, to do the same work we were previously doing with three. Hence 1+1=3. The simplicity lets everybody run fewer instances, at less cost, with better p99 latency; it's just better across the board.


What's driving that?


Sometimes workloads get conflated: you might have your primary logic in Groovy scripts alongside some security code that's CPU-bound and accidentally leaks long-lived objects. The two scenarios call for different GC characteristics, yet you have to pick one garbage collector, and it's hard to optimize for both. By breaking them up, you can start to tune each runtime individually. You might tune for more inlining in one scenario and tune thread stack size in the other. When they're mixed together, you've lost those tuning capabilities.
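As a sketch of what "tuning individually" can look like after a split (the flag values below are illustrative assumptions, not Netflix's actual settings), each service can pick its own collector and sizing:

```shell
# Hypothetical launch commands after splitting the two workloads.
# Service A: latency-sensitive business logic -> G1 with a pause-time target
# and a modest thread stack size.
java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -Xss512k -jar edge-logic.jar

# Service B: CPU-bound security code with long-lived objects -> throughput
# collector, a fixed heap, and more room for inlining hot methods.
java -XX:+UseParallelGC -Xms4g -Xmx4g -XX:MaxInlineSize=70 -jar edge-crypto.jar
```

In the combined monolith, a single set of these flags has to serve both workloads at once, which is exactly the compromise the split removes.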


What would you want people to leave the talk with?


Attendees will leave with the tools needed to implement one or more of the patterns we use at Netflix. For a few of the patterns, there's a library attendees can use to implement them directly. Some patterns also serve as cost-saving mechanisms. That's not necessarily why Netflix used them, but attendees could see them as opportunities to save money by implementing one of these algorithms or patterns over the course of maybe a couple of days. One or two engineers could probably implement most of the patterns in a week. Ideally, attendees will see the patterns as accessible and be excited to try them.

Speaker: Justin Ryan

Playback Edge Engineering @Netflix

Justin Ryan started writing code on a Commodore 64 and hasn’t stopped since then. As part of Netflix’s Playback Edge Engineering, he works on some of the most critical services at Netflix, specifically focusing on user and device authentication. Years of building developer tools has also given him a healthy set of opinions on developer productivity.

