CRDTs in Production | QCon San Francisco 2018

What You’ll Learn

Hear how PayPal developed a distributed system dealing with consistency issues.
Find out what scenarios eventual consistency works for.
Learn that eventual consistency is doable and why it matters.

Abstract

In search of scalability and availability improvements, many companies adopt eventual consistency as the consistency model underlying their stateful systems and persistent data stores. At the same time, software designers are focused on creating resilient systems that are ready to work in production with minimal complexity. Dmitry will share lessons learned in developing a distributed system based on an eventually consistent data store. The final solution uses conflict-free replicated data types (CRDTs) with causality tracking to achieve strong eventual consistency for critical data in multi-master, multi-datacenter DB (Aerospike) deployments.
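To make "conflict-free replicated data types with causality tracking" concrete, here is a minimal sketch of one well-known CRDT, a multi-value register whose writes are tagged with version vectors. It is an illustration only and not the design presented in the talk: the class name `MVRegister`, the replica ids, and the status values are all hypothetical.

```python
from dataclasses import dataclass, field


def dominates(a: dict, b: dict) -> bool:
    """True if version vector `a` has seen every event recorded in `b`."""
    return all(a.get(replica, 0) >= count for replica, count in b.items())


@dataclass
class MVRegister:
    """Hypothetical multi-value register: concurrent writes are kept as siblings."""
    replica_id: str
    siblings: list = field(default_factory=list)  # list of (value, version_vector)

    def _context(self) -> dict:
        # Causal context: element-wise max over all version vectors seen so far.
        ctx = {}
        for _, vv in self.siblings:
            for replica, count in vv.items():
                ctx[replica] = max(ctx.get(replica, 0), count)
        return ctx

    def write(self, value) -> None:
        # A local write supersedes everything this replica has already seen.
        vv = self._context()
        vv[self.replica_id] = vv.get(self.replica_id, 0) + 1
        self.siblings = [(value, vv)]

    def merge(self, other: "MVRegister") -> None:
        # Keep only entries that no other entry causally dominates.
        combined = self.siblings + other.siblings
        kept = []
        for value, vv in combined:
            superseded = any(dominates(o, vv) and o != vv for _, o in combined)
            if not superseded and (value, vv) not in kept:
                kept.append((value, vv))
        self.siblings = kept

    def values(self) -> list:
        return sorted(value for value, _ in self.siblings)


if __name__ == "__main__":
    a, b = MVRegister("dc-a"), MVRegister("dc-b")
    a.write("verified")      # concurrent writes in two datacenters
    b.write("restricted")
    a.merge(b)
    print(a.values())        # both values survive as siblings
    a.write("restricted")    # a later write that has seen both siblings
    b.merge(a)
    print(b.values())        # the causally newer write wins
```

The version vectors are what let `merge` tell a genuinely concurrent write (keep both) from a stale one (discard it), which is the role causality tracking plays in reaching the same state on every replica.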

Question:

What's the focus of the work that you're doing today?

Answer:

This CRDT project was my first project at PayPal, where I was part of the PD platform team. This team sits on top of the infrastructure and provides services that are used by product development engineers. Our platform lets us keep product development aligned with the existing infrastructure, and at the same time we need to be efficient in throughput and consistency. We designed this CRDT solution specifically for our situation: for our partners and for the current state of our infrastructure. There is also a constraint on how many dependencies we can afford. For example, CRDTs were a very good fit because we don't know the configuration of the cluster. We don't know how many data centers there are. This immediately ruled out consensus protocols, because consensus relies on a majority of n/2 + 1 nodes, and we don't have this capability on the infrastructure side. It does not come out of the box, so we would need to implement it ourselves, but we don't know enough about the infrastructure. That's why we tried to build a reliable solution on not very reliable components.
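The "n divided by 2 plus 1" rule mentioned above can be written out as a short sketch. The function names are hypothetical; the point is only that the check needs the cluster size n as an input, which is exactly the piece of information the team says it does not have.

```python
def majority_quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority of the cluster."""
    return cluster_size // 2 + 1


def can_reach_consensus(reachable_nodes: int, cluster_size: int) -> bool:
    """A majority-based protocol can only make progress with a quorum reachable."""
    return reachable_nodes >= majority_quorum(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "nodes need a quorum of", majority_quorum(n))
    # A node that can reach only 2 of 5 nodes cannot commit anything.
    print(can_reach_consensus(reachable_nodes=2, cluster_size=5))
```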

Question:

What was the specific use case that you were solving?

Answer:

We are talking about compliance statuses. These are things like whether you're verified and whether you can or cannot transact.

Question:

Who is the core audience you're talking to?

Answer:

I'm talking to product architects who build product solutions such as services. They might live in the cloud or on on-premise infrastructure, but they do not always have enough control over how many databases there are, how the databases are shared, or how the network is organized. This is the reality that we face today in most companies. You just have solutions. And if there is a spike on other databases we say 'We provide consistency for this and that, but for anything else, we do not provide it', and you need to have something reliable.

Question:

So the reason CRDTs made the most sense for PayPal is that you didn't know the exact quorum size, because the infrastructure could change?

Answer:

Exactly. And we also need remote deployment. This means that nodes should be able to work in isolation for some period of time. And if the majority of the cluster is not accessible to a node, that node should still be able to handle requests.
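As a rough illustration of a node that keeps handling requests while cut off from the rest of the cluster, the sketch below uses a grow-only set, one of the simplest CRDTs. It is not the data type used at PayPal; `GSetReplica`, the replica names, and the flag strings are hypothetical.

```python
class GSetReplica:
    """Hypothetical grow-only set replica; merge is a plain set union."""

    def __init__(self, name: str):
        self.name = name
        self.items = set()

    def add(self, item: str) -> None:
        # A local write: no coordination with other replicas is needed.
        self.items.add(item)

    def merge(self, other: "GSetReplica") -> None:
        # Union is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repeated deliveries.
        self.items |= other.items


if __name__ == "__main__":
    west, east, remote = GSetReplica("west"), GSetReplica("east"), GSetReplica("remote")
    # "remote" is partitioned from the majority but still accepts writes.
    remote.add("user-1:verified")
    west.add("user-2:verified")
    east.add("user-3:restricted")
    # The partition heals; state is exchanged in an arbitrary order.
    west.merge(remote)
    east.merge(west)
    remote.merge(east)
    west.merge(east)
    print(west.items == east.items == remote.items)
```

A majority-based protocol would have rejected the write on the isolated node; here it is accepted locally and reconciled once connectivity returns.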

Question:

Why is this talk in the production readiness track?

Answer:

Because as product developers we want to be able to provide a high quality of service and high availability regardless of what's going on in the infrastructure. We want to achieve maximum efficiency with the components that we have today, and with the anticipation of what these components' behavior might be.

Question:

Is this an advanced or intermediate talk?

Answer:

I would say this is an advanced talk.

Question:

What do you want someone to walk away from this talk with?

Answer:

First of all, I want someone to walk away knowing that eventual consistency is feasible. Recently I was at a FoundationDB talk, and somebody said that eventual consistency equals no consistency. I disagree with that. It is not easy. It has to be designed well, and you have to think about the access patterns. It is not generic; it is very specific to a particular case. You need to assess what the root cause of your concurrency issue is. But it is feasible, and it allows you to avoid the issues resulting from network latency, because I think that at the current scale and pace of interaction between people and technology, synchronized solutions are very limited.

Speaker: Dmitry Martyanov

Software Engineer @PayPal

Software engineer at PayPal working with a focus on distributed systems and resilient architectures.

Tracks

Monday, 5 November

Microservices / Serverless Patterns & Practices: Evolving, observing, persisting, and building modern microservices

Practices of DevOps & Lean Thinking: Practical approaches using DevOps & Lean Thinking

JavaScript & Web Tech: Beyond JavaScript in the Browser. Exploring WebAssembly, Electron, & Modern Frameworks

Modern CS in the Real World: Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming

Modern Operating Systems: Applied, practical, & real-world deep-dive into industry adoption of OS, containers and virtualization, including Linux on Windows, LinuxKit, and Unikernels

Optimizing You: Human Skills for Individuals: Better teams start with a better self. Learn practical skills for IC

Tuesday, 6 November

Architectures You've Always Wondered About: Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, & more

21st Century Languages: Lessons learned from languages like Rust, Go-lang, Swift, Kotlin, and more.

Emerging Trends in Data Engineering: Showcasing DataEng tech and highlighting the strengths of each in real-world applications.

Bare Knuckle Performance: Killing latency and getting the most out of your hardware

Socially Conscious Software: Building socially responsible software that protects users' privacy & safety

Delivering on the Promise of Containers: Runtime containers, libraries, and services that power microservices

Wednesday, 7 November

Applied AI & Machine Learning: Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, PyTorch, & more

Production Readiness: Building Resilient Systems: More than just building software, building deployable production ready software

Developer Experience: Level up your Engineering Effectiveness: Improving the end to end developer experience - design, dev, test, deploy, operate/understand.

Security: Lessons Attacking & Defending: Security from the defender's AND the attacker's point of view

Future of Human Computer Interaction: IoT, voice, mobile: Interfaces pushing the boundary of what we consider to be the interface

Enterprise Languages: Workhorse languages found in modern enterprises. Expect Java, .NET, & Node in this track

Track: Production Readiness: Building Resilient Systems

Location: Ballroom A

Duration: 2:55pm - 3:45pm

Day of week: Wednesday

Level: Intermediate - Advanced

Persona: Backend Developer, Developer
