Live Resharding Without Regret: Lessons from Building Valkey's Atomic Slot Migration

Abstract

Sharding is easy. Resharding under heavy load is notoriously difficult. How do you move gigabytes of state across live database nodes without dropping keys, blocking the main event loop, or breaking client abstractions?

Using Valkey and Redis as case studies, we'll survey different resharding architectures and dive deep into Valkey's new Atomic Slot Migration. We'll walk through the practical tradeoffs of these approaches, covering client redirections (MOVED/ASK), fork-based slot snapshotting, and rollback staging. Along the way, we'll highlight the edge cases that actually matter in production.


Speaker

Jacob Murphy

Open Source Maintainer @Valkey & Software Engineer @Google Cloud's Memorystore Team

Jacob Murphy is a Valkey project maintainer and Technical Steering Committee member. During the day, he is a Staff Software Engineer on Google Cloud's Memorystore team, where he sets the technical direction for Google's managed in-memory databases, including Redis, Valkey, and Memcached. His technical focus is on distributed systems, high availability, and scaling cache infrastructure. In his spare time, Jacob enjoys road biking, hiking, and tackling DIY home repair projects.
