Write-Ahead Intent Log: A Foundation for Efficient CDC at Scale

Abstract

As companies grow, so does the complexity of keeping distributed systems in sync. At DoorDash, we tackled this challenge while building a high-throughput, domain-oriented data platform for capturing changes across hundreds of services.

Instead of relying on traditional Change Data Capture (CDC) mechanisms, we designed a Write-Ahead Intent Log—a lightweight, domain-scoped event stream that records write intents before state is finalized. This intent-first design acts as a durable buffer between writers and downstream consumers, enabling scalable, resilient CDC without tight coupling to database internals or the need for full mutation history.
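
As a rough illustration of the intent-first idea (not DoorDash's implementation), the sketch below appends an intent to a log before the state store is updated, so downstream consumers can read from the log rather than from database internals. The names `WriteIntent`, `IntentLog`, and `write`, along with the record fields, are hypothetical, and the in-memory list stands in for a durable stream.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class WriteIntent:
    """Hypothetical intent record: what a writer means to change."""
    domain: str          # domain scope, e.g. "orders" (assumed field)
    key: str             # entity key within the domain
    op: str              # "upsert" or "delete"
    payload: bytes = b""


@dataclass
class IntentLog:
    """In-memory stand-in for a durable, append-only, domain-scoped stream."""
    entries: List[WriteIntent] = field(default_factory=list)

    def append(self, intent: WriteIntent) -> int:
        self.entries.append(intent)
        return len(self.entries) - 1  # offset of the new entry

    def read_from(self, offset: int) -> List[WriteIntent]:
        # Consumers poll the log from their last offset.
        return self.entries[offset:]


def write(log: IntentLog, store: Dict[str, bytes], intent: WriteIntent) -> int:
    """Record the intent first, then finalize state in the store."""
    offset = log.append(intent)      # 1. log the intent before state changes
    if intent.op == "delete":        # 2. finalize state afterwards
        store.pop(intent.key, None)
    else:
        store[intent.key] = intent.payload
    return offset
```

In this sketch a consumer that calls `read_from` sees every intent in append order without touching the store.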

In this talk, we’ll explore:

  • Efficiency: How publishing write intents instead of raw state changes shrinks payload size, reduces coordination overhead, and simplifies downstream processing.
  • Performance: How techniques such as per-key concurrency control, progressive consistency reads, and partition-aware retries let us achieve sub-second tail latencies at up to 1M writes per second per table.
  • Maintainability: How a Protobuf-based key-value schema abstraction stays easy for polyglot teams to consume, with built-in support for dead-letter queues and bounded retries, and planned features such as schema evolution via Protobuf and a schema registry.
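
To make per-key ordering, bounded retries, and dead-letter queues concrete, here is a minimal sketch under stated assumptions; it is not the system described in the talk. The function `process_batch`, the `(key, value)` record shape, and the `max_attempts` parameter are all hypothetical. Records are grouped by key so that order is preserved within a key, and a record that keeps failing is moved to a dead-letter list instead of blocking other keys.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, str]  # (key, value); hypothetical record shape


def process_batch(
    records: List[Record],
    handler: Callable[[str, str], None],
    max_attempts: int = 3,
) -> List[Record]:
    """Process records key by key; return the records sent to the dead-letter list."""
    by_key: Dict[str, List[Record]] = defaultdict(list)
    for key, value in records:
        by_key[key].append((key, value))  # preserves arrival order per key

    dead_letters: List[Record] = []
    # Each key group is independent. This sketch runs them sequentially;
    # a real system could hand each group to its own worker.
    for key, key_records in by_key.items():
        for _, value in key_records:
            for _attempt in range(max_attempts):
                try:
                    handler(key, value)
                    break                  # success: stop retrying
                except Exception:
                    continue               # bounded retry (no backoff here)
            else:
                # All attempts failed: park the record, keep going.
                dead_letters.append((key, value))
    return dead_letters
```

The sketch omits backoff between attempts and keeps processing later records for a key after one of them is dead-lettered; whether that is acceptable depends on how strict per-key ordering needs to be.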

We’ll also share how this approach helped us avoid pitfalls like head-of-line blocking and schema drift—without relying on heavyweight infrastructure.

Key Takeaways:

  1. Intent-First Logging Enables Loose Coupling: By separating write intent from final state, you can decouple services cleanly and unlock asynchronous integrations without overloading databases.
  2. Throughput and Latency Can Coexist: With the right concurrency controls and retry strategies, it's possible to achieve sub-second tail latencies at up to 1M writes per second per table.
  3. Simplicity Scales: A domain-scoped, schema-defined log format is easier to evolve and operate than opaque change logs tied to database internals.

Date

Wednesday Nov 19 / 03:55PM PST (50 minutes)

Location

Ballroom A

From the same track

Session

How Netflix Shapes our Fleet for Efficiency and Reliability

Wednesday Nov 19 / 11:45AM PST

Netflix runs on a complex multi-layer cloud architecture made up of thousands of services, caches, and databases. As hardware options, workload patterns, cost dynamics, and Netflix's products evolve, the cost-optimal hardware and configuration for running our services are constantly changing.

Joseph Lynch

Principal Software Engineer @Netflix Building Highly-Reliable and High-Leverage Infrastructure Across Stateless and Stateful Services

Argha C

Staff Software Engineer @Netflix - Leading Netflix's Cloud Scalability Efforts for Live

Session

Realtime and Batch Processing of GPU Workloads

Wednesday Nov 19 / 01:35PM PST

SS&C Technologies runs 47 trillion dollars of assets on our global private cloud. We have the primitives for infrastructure as well as platforms as a service like Kubernetes, Kafka, NiFi, Databases, etc.

Joseph Stein

Principal Architect of Research & Development @SS&C Technologies, Previous Apache Kafka Committer and PMC Member

Session

From ms to µs: OSS Valkey Architecture Patterns for Modern AI

Wednesday Nov 19 / 02:45PM PST

As AI applications demand faster and more intelligent data access, traditional caching strategies are hitting performance and reliability limits. 

Dumanshu Goyal

Uber Technical Lead @Airbnb Powering $11B Transactions, Formerly @Google and @AWS

Session

One Platform to Serve Them All: Autoscaling Multi-Model LLM Serving

Wednesday Nov 19 / 10:35AM PST

AI teams are moving away from hosted LLMs to self-hosted inference as fine-tuning drives model performance. The catch is scale: hundreds of variants create long-tail traffic, cold starts, and duplicated stacks.

Meryem Arik

Co-Founder and CEO @Doubleword (Previously TitanML), Recognized as a Technology Leader in Forbes 30 Under 30, Recovering Physicist