From ms to µs: OSS Valkey Architecture Patterns for Modern AI

Abstract

As AI applications demand faster and more intelligent data access, traditional caching strategies are hitting performance and reliability limits. 

This talk presents architecture patterns that shift from milliseconds to microseconds using Valkey Cluster, an open-source, Redis-compatible, in-memory datastore. Learn when to use proxy-based versus direct-access caching, how to avoid hidden reliability issues in sharded systems, and how to optimize for high price-performance at scale. Backed by the Linux Foundation, Valkey offers rich data structures and community-driven innovation. 

Whether you’re building GenAI services or scaling existing platforms, this session delivers actionable patterns to improve speed, resilience, and efficiency.


Speaker

Dumanshu Goyal

Uber Technical Lead @Airbnb Powering $11B Transactions, Formerly @Google and @AWS

Dumanshu Goyal leads Online Data Priorities at Airbnb, powering its $11B transaction platform and building the next generation of the company’s data systems. Previously, he led in-memory caching for Google Cloud Databases, delivering 10x improvements in scale and price-performance for Google Cloud Memorystore, one of the rare times “10x” was more than a slide promise. Before that, he spent 10 years at AWS as the founding engineer of AWS Timestream, a serverless time-series database, and architected durability and availability features for one of the internet’s foundational services, AWS DynamoDB, ensuring data stayed available even when Wi-Fi did not.

An expert in building and operationalizing large-scale distributed systems, Dumanshu holds 20 patents and brings deep experience in architecting the resilient infrastructure that underpins today’s digital world.


From the same track

Session

How Netflix Shapes our Fleet for Efficiency and Reliability

Wednesday Nov 19 / 11:45AM PST

Netflix runs on a complex multi-layer cloud architecture made up of thousands of services, caches, and databases. As hardware options, workload patterns, cost dynamics, and Netflix's products evolve, the cost-optimal hardware and configuration for running our services is constantly changing.


Joseph Lynch

Principal Software Engineer @Netflix Building Highly-Reliable and High-Leverage Infrastructure Across Stateless and Stateful Services


Argha C

Staff Software Engineer @Netflix - Leading Netflix's Cloud Scalability Efforts for Live

Session

Realtime and Batch Processing of GPU Workloads

Wednesday Nov 19 / 01:35PM PST

SS&C Technologies runs $47 trillion in assets on our global private cloud. We provide infrastructure primitives as well as platform-as-a-service offerings such as Kubernetes, Kafka, NiFi, and databases.


Joseph Stein

Principal Architect of Research & Development @SS&C Technologies, Previous Apache Kafka Committer and PMC Member

Session

One Platform to Serve Them All: Autoscaling Multi-Model LLM Serving

Wednesday Nov 19 / 10:35AM PST

AI teams are moving from hosted LLMs to self-hosted inference as fine-tuning drives model performance. The catch is scale: hundreds of model variants create long-tail traffic, cold starts, and duplicated serving stacks.


Meryem Arik

Co-Founder and CEO @Doubleword (Previously TitanML), Recognized as a Technology Leader in Forbes 30 Under 30, Recovering Physicist

Session

Cost-Conscious Cloud: Designing Systems that Don't Break the Bank

Wednesday Nov 19 / 03:55PM PST

Details coming soon.