Architecting a Centralized Platform for Data Deletion at Netflix

Abstract

What does it take to safely delete data at Netflix scale? In large-scale systems, data deletion cuts across infrastructure, reliability, and performance concerns. Data lives in many data stores, each with different tradeoffs and each requiring its own ad-hoc deletion solution, which leaves users with fragmented behavior and inconsistent outcomes. As data volumes grow and data stores become more distributed and complex, ensuring safe deletion becomes an even greater challenge. Without a centralized architecture, teams often build isolated solutions, resulting in inconsistent practices, duplicated effort, and growing operational overhead.

At Netflix, we have developed an architecture for managing data deletion across diverse data stores, addressing these challenges while improving overall system resilience. The centralized, extensible platform covers the end-to-end data deletion lifecycle, from identifying the data to verifying and executing deletion. It includes configurable deletion controls, journaling, observability, and data recoverability to ensure safe and reliable operation.
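
As a rough illustration of the lifecycle described above (identify, journal, execute, verify, recover), here is a minimal sketch. It is not Netflix's implementation: every name in it (`DataStore`, `InMemoryStore`, `DeletionControls`, `run_deletion`, `recover`) is hypothetical, and the "owner id prefix" key scheme is an assumption made only to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


class DataStore(Protocol):
    """Hypothetical adapter interface: one implementation per backing store."""

    def find(self, owner_id: str) -> List[str]: ...
    def read(self, key: str) -> bytes: ...
    def write(self, key: str, value: bytes) -> None: ...
    def delete(self, key: str) -> None: ...


@dataclass
class InMemoryStore:
    """Toy store for the sketch; keys are assumed to look like '<owner_id>:<name>'."""

    rows: Dict[str, bytes] = field(default_factory=dict)

    def find(self, owner_id: str) -> List[str]:
        return sorted(k for k in self.rows if k.startswith(owner_id + ":"))

    def read(self, key: str) -> bytes:
        return self.rows[key]

    def write(self, key: str, value: bytes) -> None:
        self.rows[key] = value

    def delete(self, key: str) -> None:
        del self.rows[key]


@dataclass
class DeletionControls:
    """Assumed configurable controls: a blast-radius cap and a dry-run switch."""

    max_keys_per_run: int = 1000
    dry_run: bool = False


@dataclass
class JournalEntry:
    key: str
    value: bytes  # pre-delete copy, kept so the delete can be undone


def run_deletion(store: DataStore, owner_id: str, controls: DeletionControls,
                 journal: List[JournalEntry]) -> List[str]:
    # 1. Identify the data to delete.
    keys = store.find(owner_id)
    # 2. Apply the controls before touching anything.
    if len(keys) > controls.max_keys_per_run:
        raise RuntimeError(f"{len(keys)} keys exceeds cap {controls.max_keys_per_run}")
    if controls.dry_run:
        return []
    # 3. Journal each record, then delete it.
    deleted = []
    for key in keys:
        journal.append(JournalEntry(key, store.read(key)))
        store.delete(key)
        deleted.append(key)
    # 4. Verify that nothing for this owner remains.
    if store.find(owner_id):
        raise RuntimeError("verification failed: data still present")
    return deleted


def recover(store: DataStore, journal: List[JournalEntry]) -> None:
    """Restore journaled records, e.g. after an erroneous deletion."""
    for entry in journal:
        store.write(entry.key, entry.value)
```

In this sketch, writing the journal entry before each delete is what makes the operation both auditable and recoverable, and the cap in `DeletionControls` stands in for whatever throughput and safety limits a real platform would enforce.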

In this talk, we share the design and execution tradeoffs behind the data deletion platform. We explain how techniques such as orchestration, journaling, observability, and recoverability make the deletion system reliable and auditable, and we highlight key engineering tradeoffs, including how we balance throughput, safety, and scalability across diverse systems while maintaining resilience under live traffic.

Key Takeaways:

  • Understand the architectural challenges of data deletion and why a centralized approach is essential.
  • Learn how orchestration, observability, journaling, and recoverability enable safe deletion across diverse data stores.
  • Explore the tradeoffs Netflix made to balance throughput, safety, and scalability under live traffic.
  • Gain practical insights from real-world engineering decisions in building and operating large-scale deletion workflows.

Speaker

Vidhya Arvind

Tech Lead & a Founding Architect for the Data Abstraction Platform @Netflix, Previously @Box and @Verizon

Vidhya Arvind is a Tech Lead at Netflix and a founding architect of Netflix’s cutting-edge data abstraction platform. She is a recognized expert in designing and delivering scalable, high-impact data abstractions that empower thousands of developers across the organization to move faster with confidence. With expertise in crafting robust APIs and high-performance abstractions, Vidhya drives the seamless operation of complex abstractions at massive scale. She is known for her strategic thinking, curiosity, and a systems-level mindset that fuels her passion for debugging, innovating, and solving deeply technical challenges. Vidhya has played a pivotal role in shaping the evolution of Netflix's data infrastructure, enabling mission-critical systems to run with exceptional efficiency, reliability, and resilience. Vidhya lives in the Bay Area with her family and loves hiking on trails in the area.

Speaker

Shawn Liu

Senior Software Engineer @Netflix, Building Reliable and Extensible Systems for Consumer Data Lifecycle at Scale

Shawn Liu is a senior software engineer at Netflix, where he builds highly available consumer identity systems and manages account and profile lifecycles at massive scale. With diverse experience in distributed systems, event-driven architectures, and high-throughput data pipelines, Shawn shapes the company-wide data lifecycle architecture, standardizing interfaces and safeguards across services and data stores. His recent work focuses on building and operationalizing a centralized, extensible deletion architecture designed for reliability and resilience at global scale.
 

From the same track

Session

How to Build an Exchange: Sub Millisecond Response Times and 24/7 Uptimes in the Cloud

Monday Nov 17 / 10:35AM PST

These days it is possible to achieve fairly good performance on cloud provisioned systems. We discuss the design of a high performance, strongly consistent system which maintains constant service in the face of regular updates to core logic.

Frank Yu

Director of Engineering @Coinbase, Previously Principal Engineer and Director @FairX

Session

Building Resilient Platforms: Insights from 20+ Years in Mission-Critical Infrastructure

Monday Nov 17 / 11:45AM PST

In this talk, Matthew will describe lessons learned from 20+ years of building scalable, secure, and stable infrastructure platforms for software in financial services (electronic trading, credit card processing, etc.). The talk is relevant to anyone building platforms for mission-critic

Matthew Liste

Head of Infrastructure @American Express, Previously @JPMorgan Chase and @Goldman Sachs

Session

Unconference: Architectures You've Always Wondered About

Monday Nov 17 / 05:05PM PST

Session

Compiling Workflows into Databases: The Architecture That Shouldn't Work (But Does)

Monday Nov 17 / 02:45PM PST

What if everything you know about building distributed systems is backwards?

Jeremy Edberg

CEO @DBOS, Inventor of Chaos Engineering, Tech Editor for 'AWS for Dummies', Previously Founding Reliability Engineer @Netflix and Ops @Reddit

Qian Li

Co-founder, Architect @DBOS, Stanford CS Ph.D., Co-organizer of South Bay Systems

Session

The Architecture of an Infinite Scroll

Monday Nov 17 / 03:55PM PST

Details coming soon.