Abstract
As companies grow, so does the complexity of keeping distributed systems in sync. At DoorDash, we tackled this challenge while building a high-throughput, domain-oriented data platform for capturing changes across hundreds of services.
Instead of relying on traditional Change Data Capture (CDC) mechanisms, we designed a Write-Ahead Intent Log—a lightweight, domain-scoped event stream that records write intents before state is finalized. This intent-first design acts as a durable buffer between writers and downstream consumers, enabling scalable, resilient CDC without tight coupling to database internals or the need for full mutation history.
In this talk, we’ll explore:
- Efficiency: How publishing write intents instead of raw state changes shrinks payload size, reduces coordination overhead, and simplifies downstream processing.
- Performance: How techniques like per-key concurrency control, progressive consistency reads, and partition-aware retries let us achieve sub-second tail latencies at up to 1M writes per second per table.
- Maintainability: A Protobuf-based key-value schema abstraction that’s easily consumed by polyglot teams, with built-in support for dead-letter queues and bounded retries, plus forward-looking features like schema evolution via Protobuf and a schema registry.
We’ll also share how this approach helped us avoid pitfalls like head-of-line blocking and schema drift—without relying on heavyweight infrastructure.
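To make the ideas of per-key concurrency control, bounded retries, and dead-letter queues concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not DoorDash's implementation: the `Intent` type, the `drain` function, and the `max_attempts` parameter are hypothetical names invented for this example. Intents are grouped into one ordered lane per key, each intent is retried a bounded number of times, and an intent that keeps failing is moved to a dead-letter list so that later intents are not stuck behind it (the head-of-line blocking problem).

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Intent:
    """Hypothetical write intent; field names are assumptions for this sketch."""
    key: str      # entity key that defines the ordering scope
    payload: str  # stand-in for a serialized intent body


def drain(
    intents: Iterable[Intent],
    handler: Callable[[Intent], None],
    max_attempts: int = 3,
) -> Tuple[List[Intent], List[Intent]]:
    """Apply intents in per-key order with bounded retries.

    Returns (processed, dead_letters).
    """
    # One FIFO lane per key: order is preserved within a key only.
    lanes: Dict[str, Deque[Intent]] = defaultdict(deque)
    for intent in intents:
        lanes[intent.key].append(intent)

    processed: List[Intent] = []
    dead_letters: List[Intent] = []

    # Lanes are drained one after another here for simplicity; because they
    # share no state, each lane could instead be given to its own worker.
    for lane in lanes.values():
        while lane:
            intent = lane.popleft()
            for attempt in range(1, max_attempts + 1):
                try:
                    handler(intent)
                    processed.append(intent)
                    break
                except Exception:
                    # Bounded retries: after the last attempt, park the intent
                    # in the dead-letter list and move on to the next one.
                    if attempt == max_attempts:
                        dead_letters.append(intent)

    return processed, dead_letters
```

Calling `drain(intents, handler)` with a handler that always raises for one particular intent returns that intent in the dead-letter list, while every other intent, including later ones for the same key, is still processed. Whether skipping ahead within a key is acceptable depends on the domain; a stricter design might pause the whole lane instead.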
Key Takeaways:
- Intent-First Logging Enables Loose Coupling: By separating write intent from final state, you can decouple services cleanly and unlock asynchronous integrations without overloading databases.
- Throughput and Latency Can Coexist: With the right concurrency controls and retry strategies, it's possible to achieve sub-second tail latencies at up to 1M writes per second per table.
- Simplicity Scales: A domain-scoped, schema-defined log format is easier to evolve and operate than opaque change logs tied to database internals.