Adopting Continuous Deployment at Lyft

All organizations, regardless of size, need to be able to make rapid changes and improvements in their constantly growing systems. How can we handle all this change while maintaining a reliable product? 

In 2018, Lyft operated a few hundred services. Deploying a change was difficult: a developer had to take a lock, read a runbook, then execute several manual steps, all while monitoring for potential problems. This took time, resulting in large, delayed deploy trains and, ultimately, reliability issues. Today, Lyft operates over 1,000 services and, by adopting continuous deployment, deploys more than 90% of them to production automatically, with no manual intervention. This has significantly improved reliability, freed up developer time, and sped up our ability to ship changes.

I will share the details of our journey to continuous deployment, including the benefits, challenges, and lessons we learned along the way:

  1. The benefits, obvious and maybe non-obvious, of continuous deployment.
  2. How to set up your organization’s deploy culture to successfully adopt continuous deployment.
  3. How to design a pluggable system of checks to automatically detect issues in deployments before they become widespread.
  4. How we measured to ensure we improved reliability and developer productivity through continuous deployment.
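As a rough illustration of the third point, a pluggable check system might look like the sketch below. It is not Lyft's implementation, which this abstract does not describe. All names here (`CheckResult`, `register`, `may_promote`, and the two sample checks with their context keys) are hypothetical, chosen only to show the shape: each check is an independent function registered by name, and a deploy is promoted only if every registered check passes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


# A check receives a deploy context (a plain dict in this sketch)
# and returns a CheckResult. The context keys are assumptions.
Check = Callable[[dict], CheckResult]

_REGISTRY: Dict[str, Check] = {}


def register(name: str) -> Callable[[Check], Check]:
    """Decorator that plugs a check into the registry under `name`."""
    def decorator(fn: Check) -> Check:
        _REGISTRY[name] = fn
        return fn
    return decorator


@register("error_rate")
def error_rate_check(ctx: dict) -> CheckResult:
    # Hypothetical check: fail if the observed error rate exceeds a limit.
    rate = ctx.get("error_rate", 0.0)
    limit = ctx.get("max_error_rate", 0.01)
    return CheckResult("error_rate", rate <= limit, f"rate={rate}, limit={limit}")


@register("open_incident")
def open_incident_check(ctx: dict) -> CheckResult:
    # Hypothetical check: fail while an incident is open.
    blocked = bool(ctx.get("open_incident", False))
    return CheckResult("open_incident", not blocked, f"open_incident={blocked}")


def run_checks(ctx: dict) -> List[CheckResult]:
    """Run every registered check, in registration order."""
    return [check(ctx) for check in _REGISTRY.values()]


def may_promote(ctx: dict) -> bool:
    """A deploy may proceed only if all registered checks pass."""
    return all(result.passed for result in run_checks(ctx))
```

Adding a new safeguard in this design means writing one more decorated function; the code that decides whether to promote a deploy does not change.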

What is the focus of your work these days?

I've been at Lyft for almost four years now. Since I joined, my day-to-day responsibility has been overseeing infrastructure, mostly on the production side: what happens after the code has been unit tested and built and is ready to go to production. I have focused a lot on deployment, as you'll see in this talk. These days I oversee more than deployment: also networking, how the processes work, how we manage our Kubernetes clusters, and things like that.

And what's the motivation for your talk?

I feel that continuous deployment, or continuous delivery, is the apex of automation. The idea seems a little scary, but it also seems fantastic. It almost seems like a fantasy world that no one can really get to: you can ship a code change at any time and it can safely go to production. You could even deploy on a Friday. We haven't really gotten to that true fantasy world; we don't have unicorns floating around. But we're getting really close. The purpose of this talk is to show people that it's not an impossible goal. It is possible, especially if you cut the right corners for your organization. So, yeah, I'm here to tell you that it is possible.

How would you describe the persona and level of your target audience?

The persona for this talk is someone who either is in a decision-making position or wants to be in one, and who wants to create a very large cultural shift in their organization. Adopting continuous deployment or continuous delivery tends to change the shipping culture of the entire organization. It's something we've seen, and something I'll mention in this talk as well. So if you want to be someone who can usher in that big phase change in your organization, this talk is for you.

You've touched on this a little bit, but what is it that you would like the people that go to your session to walk away with?

As I said earlier: this is possible. It can seem like an impossible task. I know some larger organizations already have some aspects of continuous deployment. But consider organizations in the teenager phase, where one part of the organization has grown and another hasn't kept pace, so maybe the infrastructure team is a little weaker than the rest. Even in those cases, you can build a tool that makes continuous deployment possible. We've been in that phase of growing unevenly, and even then we were able to ship something that has helped tremendously with our operations.


Speaker

Tom Wanielista

Senior Staff Software Engineer @Lyft

Tom Wanielista is part of the Infrastructure team at Lyft, where he has focused on improving reliability in production by speeding up the deployment feedback loop. Prior to Lyft, Tom worked on Infrastructure in the Fintech space, where he was responsible for building tools to allow developers to safely deploy changes while keeping the stack secure & compliant. Tom studied at New York University where he received a BA in Computer Science.

Date

Monday Oct 24 / 10:35AM PDT (50 minutes)

Location

Ballroom A

Topics

Architecture, Continuous Deployment, Deploy Culture, Deployment Issues, Reliability, Developer Productivity, DevOps

From the same track

Session Architecture

Dark Side of DevOps

Monday Oct 24 / 02:55PM PDT

Topics like “you build it, you run it” and “shifting testing/security/data governance left” are popular: moving things to the earlier stages of software development, empowering engineers, shifting control definitely sounds good.

Mykyta Protsenko

Senior Software Engineer @Netflix

Session Architecture

Stress Free Change Validation at Netflix

Monday Oct 24 / 04:10PM PDT

How do you gain confidence that a system modification does what it’s supposed to do? A refactoring should not cause a functional change, whereas a feature modification should cause a specific kind of change.

Javier Fernandez-Ivern

Staff Software Engineer @Netflix with over 20 years in Software Engineering

Session Architecture

Log4Shell Response Patterns & Learnings From Them

Monday Oct 24 / 05:25PM PDT

In early December 2021, rumors about a remote code execution (RCE) vulnerability in Log4j began circulating on social media, dubbed Log4Shell. Over the next three days, those rumors were confirmed and the immense scope of the vulnerability became clear.

Tapabrata Pal

Vice President of Architecture @Fidelity

Session

Enabling Change @ Scale Roundtable

Monday Oct 24 / 11:50AM PDT

Increasing the safe delivery of change has immense business value across a number of dimensions, so how can we improve our ability to manage change at scale?

Tom Wanielista

Senior Staff Software Engineer @Lyft

Mykyta Protsenko

Senior Software Engineer @Netflix

Tapabrata Pal

Vice President of Architecture @Fidelity

Javier Fernandez-Ivern

Staff Software Engineer @Netflix with over 20 years in Software Engineering

Session

Unconference: Architecting for Change

Monday Oct 24 / 01:40PM PDT

What is an unconference? At QCon SF, we’ll have unconferences in most of our tracks.

Shane Hastie

Global Delivery Lead for SoftEd and Lead Editor for Culture & Methods at InfoQ.com