Conference: Nov 5–7, 2018
Workshops: Nov 8–9, 2018
Presentation: The Evolution of Reddit.com's Architecture
What You’ll Learn
- Learn how Reddit is breaking down the monolith and moving to services
- Hear about lessons learnt with their architecture evolution, and about the surprising trade-offs that weren’t just technology-focused
- Gain insights into how to deal with ‘what’s next’ when experiencing immense growth
Abstract
A stroll through the history of the systems that power reddit.com, looking at things that worked, things that didn't, and where we're going next.
Interview
We're doing a mishmash of things right now. At the core of it is building an environment that allows the product team at Reddit to act: to build things that users want to see and that the team can feel confident about shipping, and to do that as quickly as possible while balancing speed with performance.
We have a couple of main supported archetypes. We've got the frontend stack, which is Node-based, and the backend side, which is primarily in Python right now. We have a shared library for all of those services to use, and that brings with it things like automatic tracing, metrics collection, logging, etc. Underlying all of that is a set of Puppet modules, shared across all of this, that handles log shipping, metrics collection and so on. It reduces the amount of work that each person has to do when launching a new service.
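To make the idea of a shared library that adds metrics and logging "for free" concrete, here is a minimal sketch in Python. It is not Reddit's actual library: the `instrumented` decorator, the in-memory `TIMINGS` and `COUNTERS` stores, and the `get_listing` handler are all hypothetical names chosen for illustration.

```python
import functools
import logging
import time
from collections import defaultdict

logger = logging.getLogger("service")

# Hypothetical in-memory sinks; a real setup would ship these to a metrics system.
TIMINGS = defaultdict(list)
COUNTERS = defaultdict(int)


def instrumented(name):
    """Wrap a handler so timing, success/failure counts and error logs come for free."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                COUNTERS[f"{name}.failure"] += 1
                logger.exception("handler %s failed", name)
                raise
            else:
                COUNTERS[f"{name}.success"] += 1
                return result
            finally:
                # Recorded on both the success and the failure path.
                TIMINGS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@instrumented("get_listing")
def get_listing(subreddit):
    # Hypothetical handler: the service author writes only the business logic.
    return [f"{subreddit}-post-1", f"{subreddit}-post-2"]
```

The point of the design is that a service author only writes the handler body; the cross-cutting concerns live in one shared place.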
This is all pretty new. One of the weird things about Reddit is that for probably 10 of the 12 years it's existed, the entire engineering team has been between five and 10 people. Only in the last two years have we really started growing, so we've started needing to get rid of the monolith and figure out how to split up into services. We also have a focus on how we deal with all these new people.
It's entirely on AWS. And right now we are not using any containers in production, so it's all just standard instances. The deployment workflow can basically be summed up as a nice automated FOR loop.
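As an illustration of what a deployment "FOR loop" over plain instances might look like, here is a minimal Python sketch. It is an assumption-laden example, not Reddit's tooling: `rolling_deploy`, `deploy_one` and `is_healthy` are hypothetical names, and the health check that stops the rollout is an added assumption rather than something described in the interview.

```python
from typing import Callable, Iterable, List


def rolling_deploy(
    hosts: Iterable[str],
    deploy_one: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy to hosts one at a time, stopping at the first unhealthy host."""
    deployed = []
    for host in hosts:
        deploy_one(host)  # hypothetical: push the new build to this instance
        if not is_healthy(host):  # assumed safety check, not from the source
            raise RuntimeError(f"{host} unhealthy after deploy; stopping rollout")
        deployed.append(host)
    return deployed
```

Calling it with stub callables, for example `rolling_deploy(["app-01", "app-02"], print, lambda host: True)`, walks the hosts in order and returns the list of hosts that were updated.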
An early step was the jump from the data center to the cloud. There's some interesting stuff there in terms of how the network latency changed drastically, how the architecture had to work and so on. One of the major components of Reddit is the listings on the site, like a subreddit or a comments page. They have gone through a ton of iterations in how they are stored, from being fetched on the fly to eventually being pre-computed. There's also dealing with the huge amounts of data we have.
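The interview does not say how the pre-computed listings are implemented. As a rough sketch of the general idea only (keep each listing sorted at write time so that a read is a cheap slice), here is a small in-memory Python example. The class `PrecomputedListings`, its methods and the `max_size` cap are all hypothetical.

```python
import bisect
from collections import defaultdict


class PrecomputedListings:
    """Keep each listing sorted by score on write so reads are a simple slice."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        # listing name -> list of (-score, post_id), kept sorted ascending,
        # so the highest-scoring post is always first.
        self._listings = defaultdict(list)
        self._scores = {}  # (listing, post_id) -> current score

    def update(self, listing, post_id, score):
        entries = self._listings[listing]
        old = self._scores.get((listing, post_id))
        if old is not None:
            # Remove the stale entry before inserting the new score.
            key = (-old, post_id)
            idx = bisect.bisect_left(entries, key)
            if idx < len(entries) and entries[idx] == key:
                entries.pop(idx)
        bisect.insort(entries, (-score, post_id))
        self._scores[(listing, post_id)] = score
        # Trim the tail so the stored listing stays bounded.
        for _, dropped in entries[self.max_size:]:
            del self._scores[(listing, dropped)]
        del entries[self.max_size:]

    def fetch(self, listing, limit=25):
        return [post_id for _, post_id in self._listings[listing][:limit]]
```

The trade-off is the usual one: writes (votes, new posts) do a little more work so that the far more frequent reads do almost none.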
The other major thing is that in the last couple of years we have grown immensely, so it's about figuring out how to deal with a bunch of new people, starting to split up the monolith, working out what services look like and what we're doing with them, and how to create autonomy for the teams in the company.
Way back, the first version of Reddit was in Lisp. That was rewritten in Python about a year in, and now R2 is the current monolith. That is being split up into Python microservices. Well, we're just calling them services, because we're not sure about the micro part.
That it's never done. Don't let the perfect be the enemy of the good. And know that a lot of the trade-offs involved are actually human, not technical.
Basically anybody who is going through the growth curve and trying to figure out what to do next, both with traffic scaling and people scaling.
Some things are really obvious in retrospect. Coordination, for example, becomes really important. But I think it's also that certain things, such as getting designs more thought out upfront, become way more important, because designs get harder to change when you've got different teams on either side of a boundary. Communication is a big part.
How to get autonomy to people in the organization, and get them to feel empowered and able to ship things as quickly as makes sense, without compromising the security and stability of the site in the process.
Tracks
-
Architectures You've Always Wondered About
Architectural practices from the world's most well-known properties, featuring startups, massive scale, evolving architectures, and software tools used by nearly all of us.
-
Going Serverless
Learn about the state of Serverless & how to successfully leverage it! Lessons learned in the track hit on security, scalability, IoT, and offer warnings to watch out for.
-
Microservices: Patterns and Practices
Stories of success and failure building modern Microservices, including event sourcing, reactive, decomposition, & more.
-
DevOps: You Build It, You Run It
Pushing DevOps beyond adoption into cultural change. Hear about designing resilience, managing alerting, CI/CD lessons, & security. Features lessons from open source, Linkedin, Netflix, Financial Times, & more.
-
The Art of Chaos Engineering
Failure is going to happen - Are you ready? Chaos engineering is an emerging discipline - What is the state of the art?
-
The Whole Engineer
Success as an engineer is more than writing code. Hear inward looking thoughts on inclusion, attitude, leadership, remote working, and not becoming the brilliant jerk.
-
Evolving Java
Java continues to evolve & change. Track covers Spring 5, async, Kotlin, serverless, the 6-month cadence plans, & AI/ML use cases.
-
Security: Attacking and Defending
Offense and defensive security evolution that application developers should know about including SGX Enclaves, effects of AI, software exploitation techniques, & crowd defense
-
The Practice & Frontiers of AI
Learn about machine learning in practice and on the horizon. Learn about ML at Quora, Uber's Michelangelo, ML workflow with Netflix Meson and topics on Bots, Conversational interfaces, automation, and deployment practices in the space.
-
21st Century Languages
Compile to Native, Microservices, Machine learning... tailor-made languages solving modern challenges, featuring use cases around Go, Rust, C#, and Elm.
-
Modern CS in the Real World
Applied trends in Computer Science that are likely to affect Software Engineers today. Topics include category theory, crypto, CRDTs, logic-based automated reasoning, and more.
-
Stream Processing In The Modern Age
Compelling applications of stream processing using Flink, Beam, Spark, Strymon & recent advances in the field, including Custom Windowing, Stateful Streaming, SQL over Streams.
-
Performance Mythbusting
Real world, applied performance proofs across stacks. Hear performance considerations for .NET, Python, & Java. Learn performance use cases with OpenJ9, Instagram, and Netflix.
-
Tools and Culture: What's Beyond a Stack of Containers?
Containers are not just a technology. They're a platform. Push your knowledge.
-
Web as Platform
All things Browser, from JavaScript Frameworks for animation and AR / VR to Web Assembly and from protocol work to open standards evolution.
-
Beyond Being an Individual Contributor
Beyond being an individual contributor. Building and Evolving managers and tech leadership.
-
Building Great Engineering Cultures
Why engineering culture matters. Track features org scaling, memes as a culture tool, Ally skills, and panels on diversity / inclusion.
-
Hardware Frontiers: Changes Affecting Software Developers Today
Topics around: Quantum computing, NVM, SMR, GPU, custom hardware, self-driving cars, and mobile hardware.