Abstract
Platform teams frequently inherit systems that were never architected for their current scale, yet are so foundational that downtime can halt the business. Operating on these fragile foundations, teams face the daunting challenge of continuously shipping new features while scaling infrastructure significantly. Continuous delivery can feel risky in such critical scenarios—but avoiding it can stall progress, frustrate internal customers, and trap teams in endless rewrites that never materialize.
Drawing from his experiences leading foundational platform teams at AWS EC2 and Datadog, Ian Nowland will share practical strategies to safely implement continuous delivery, balancing reliability with innovation. Attendees will learn how to scale confidently, enhance developer productivity, and sustainably improve their platforms—even under immense pressure.
Interview:
What is your session about, and why is it important for senior software developers?
This session is about how platform teams can safely implement continuous delivery for foundational infrastructure. Systems like CI/CD, compute, networking, and service discovery are so critical you can’t afford to break them—yet they still need to evolve. These are often legacy systems that were never designed for today’s scale but now sit at the center of everything.
For senior developers—especially those who end up inheriting these systems—it’s a real trap: the pressure to innovate is high, but the blast radius is huge. I’ll share strategies we used at AWS and Datadog to keep delivering change safely, and why that’s essential to avoid stagnation, rewrites, and developer burnout.
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
We’re entering a phase where AI is accelerating everything. Teams are racing to ship new features, and developers are using AI to generate code faster than ever. The bottleneck is no longer ideation or execution—it’s the platform standing in the way of safely and quickly getting that code into production.
In the past—like during the cloud migration—platform teams could respond by building a new “V2” platform tailored to emerging use cases. But this time is different. With AI-accelerated development, nearly every team wants to move faster. Supporting just a subset of use cases for the first couple of years isn’t enough. Foundational platforms need to incrementally evolve to deliver capabilities for all users, even as they carry foundational load.
That’s why continuous delivery for these systems has become critical. It’s about enabling safe, sustainable iteration without requiring a full rewrite followed by a years-long migration by your users. The goal is to build the internal tooling and processes that allow foundational platform teams to ship, test, and recover quickly while the business keeps moving.
What are the common challenges developers and architects face in this area?
The most common challenges I see are:
Staging environments don’t match production in either diversity or scale, which makes testing platform changes almost impossible.
Change becomes scary. Platform teams hesitate to ship—even small “quality of life” improvements—because one wrong move could bring everything down.
Grand rewrites stall out. The team starts building a “V2” but never cuts over, because the risk is too high.
Techniques like blue/green deploys, one-box testing, and traffic shadowing are well-established for stateless microservices—but often seem out of reach for foundational platforms. In this talk, I’ll cover how to bridge that gap, even when you’re working on critical, fragile systems.
What’s one thing you hope attendees will implement immediately after your talk?
Build a path to production that feels safe. That might mean introducing a shadowing mechanism. It might mean running a flaky staging use case behind a flag in production. Or it might just mean adding better observability during rollouts. But the goal is the same: get to a place where it’s safe to ship small changes continuously—even to your scariest systems.
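As a rough illustration of the shadowing idea mentioned above, here is a minimal Python sketch. It is not taken from the talk: the `ShadowRouter` class, the `shadow_enabled` flag, and the two lookup functions are hypothetical names invented for this example. It assumes a request/response style system where the existing (primary) path always serves the caller, and a candidate path sees a mirrored copy of the traffic only when the flag is on. For simplicity the mirroring is synchronous; a real system would more likely mirror asynchronously so the candidate cannot add latency.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ShadowRouter:
    """Serve every request from the primary path; mirror a sample to a candidate.

    All names here are hypothetical and exist only for illustration.
    """
    primary: Callable[[str], str]
    candidate: Callable[[str], str]
    shadow_enabled: bool = False   # hypothetical feature flag, off by default
    sample_rate: float = 1.0       # fraction of requests to mirror
    rng: random.Random = field(default_factory=random.Random)
    mismatches: List[Tuple[str, str, str]] = field(default_factory=list)
    shadowed: int = 0

    def handle(self, request: str) -> str:
        # The caller always gets the primary result, whatever the candidate does.
        result = self.primary(request)
        if self.shadow_enabled and self.rng.random() < self.sample_rate:
            self.shadowed += 1
            try:
                shadow_result = self.candidate(request)
            except Exception as exc:
                # Candidate failures are recorded, never surfaced to the caller.
                self.mismatches.append((request, result, f"error: {exc!r}"))
            else:
                if shadow_result != result:
                    self.mismatches.append((request, result, shadow_result))
        return result


def legacy_lookup(name: str) -> str:
    """Stand-in for the existing, trusted code path."""
    return f"10.0.0.{len(name)}"


def new_lookup(name: str) -> str:
    """Stand-in for the new code path; fails on one input to show isolation."""
    if name == "billing":
        raise RuntimeError("not migrated")
    return f"10.0.0.{len(name)}"
```

With the flag on, a call such as `ShadowRouter(legacy_lookup, new_lookup, shadow_enabled=True).handle("billing")` still returns the legacy answer, while the candidate's failure lands in `mismatches` for the team to inspect during the rollout.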
What makes QCon stand out as a conference for senior software professionals?
As someone who’s run large platform teams and now started a company in the space, I appreciate conferences where you can talk openly about failure modes—not just success stories. QCon consistently gets those conversations right.
What was one interesting thing that you learned from a previous QCon?
In 2019, I caught Bryan Cantrill’s talk, “No Moore Left to Give: Enterprise Computing after Moore’s Law.” He was one of the first to clearly articulate that the “free” gains we’ve relied on (faster chips, more efficient transistors, cheaper compute) were all slowing down. And while it wasn’t the sole focus of his talk, it was one of the first times I saw someone point to GPUs becoming essential for non-graphics (well, and non-blockchain) workloads, which feels prescient today.
Speaker

Ian Nowland
CEO @Junction Labs, Co-author of O'Reilly's Platform Engineering, Previously SVP Core Engineering at Datadog and Leader of AWS Nitro
Ian Nowland is the CEO and co-founder of Junction Labs, and co-author of O'Reilly’s Platform Engineering. With 25 years in software, Ian previously served as SVP of Core Engineering at Datadog during its hypergrowth phase, and spent eight formative years at AWS (2008–2016), where he led the creation and development of EMR and AWS Nitro, EC2’s virtualization platform.