Presentation: Practical Data Synchronization using CRDTs

Duration

4:10pm - 5:00pm

Key Takeaways

  • Understand why the industry needs better tooling to develop offline-friendly applications and services
  • Learn the basic mechanics behind Conflict-free Replicated Data Types using examples with counters and sets
  • Become aware of common pitfalls of CRDTs and how those can be mitigated in a real system

Abstract

In a connected world, synchronising mutable information between different devices with different clock precision can be a difficult problem. A piece of data may have many out-of-sync replicas, but all of them should eventually reach a consistent state. For example, TomTom users with personal navigation devices, smartphones and MyDrive website accounts expect their navigation information to be synchronised properly even in the occasional absence of a network connection. Conflict-free Replicated Data Types (CRDTs) provide robust data structures to achieve proper synchronisation in an unreliable network of devices. They allow conflict resolution to be done locally at the data type level while guaranteeing eventual consistency between replicas.

In addition to an introduction to common CRDT types, the main focus is on the special subtype of CRDT-Set called OUR-Set (Observed, Updated, Removed), which we created to extend known CRDT sets with update functionality.

I will demonstrate basic implementations of various CRDTs in Scala and enumerate the subtle considerations that should be taken into account. I will also explain the advantages of these data structures in solving many synchronisation problems, as well as their limitations.
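
As an illustration of conflict resolution done at the data type level, here is a minimal sketch of a grow-only counter (G-Counter), one of the standard introductory CRDTs. It is not taken from the talk or from NavCloud; the names `GCounter`, `increment`, `merge` and the replica ids "phone" and "web" are assumptions made for this example.

```scala
// Grow-only counter (G-Counter): one monotonically increasing slot per replica.
// All names here are illustrative assumptions, not NavCloud code.
final case class GCounter(counts: Map[String, Long] = Map.empty) {

  // A replica only ever increments its own slot.
  def increment(replica: String, by: Long = 1L): GCounter = {
    require(by >= 0, "a grow-only counter cannot decrease")
    GCounter(counts.updated(replica, counts.getOrElse(replica, 0L) + by))
  }

  // The observable value is the sum over all replica slots.
  def value: Long = counts.values.sum

  // Merge takes the per-replica maximum, which makes it
  // commutative, associative and idempotent.
  def merge(that: GCounter): GCounter =
    GCounter((counts.keySet ++ that.counts.keySet).map { id =>
      id -> math.max(counts.getOrElse(id, 0L), that.counts.getOrElse(id, 0L))
    }.toMap)
}

object GCounterDemo {
  def main(args: Array[String]): Unit = {
    // Two replicas update independently while offline.
    val phone = GCounter().increment("phone").increment("phone")
    val web   = GCounter().increment("web")

    // Merging in either order yields the same state.
    val a = phone.merge(web)
    val b = web.merge(phone)
    assert(a == b)
    assert(a.value == 3)
    assert(a.merge(a) == a) // re-delivering the same state changes nothing
  }
}
```

Because merge is order-insensitive and idempotent, replicas can exchange state in any order, and repeatedly after a failed connection, and still converge.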

Interview

Question: 
QCon: What is your role today?
Answer: 

Dmitry: I’m a Tech Lead / Architect at TomTom working on a project called NavCloud. NavCloud is a cloud-based service for storing and seamlessly synchronizing personal data across user devices. I'm responsible for the overall architecture of the platform, including its infrastructure, services and client libraries.

Question: 
QCon: At QCon SF you will talk about Data Synchronization using CRDTs. Can you give some insight on this?
Answer: 

Dmitry: The main focus of my talk is the data synchronization domain, which is a core part of the project I work on. In the "CRDTs" part I will talk about solving challenges using a modern academic achievement in the distributed systems area: Conflict-Free Replicated Data Types. These are special data structures that are a great fit for the use case I'm going to talk about (but not only for that). Of course, I'm going to talk from a practical, real-world perspective: how can a pragmatic developer use them? What are the limitations? What do the promises described in CRDT academic papers really mean in practice?

Question: 
QCon: You have done this talk at Strange Loop before, am I going to get the same thing at QCon SF?
Answer: 

Dmitry: The talk at QCon SF will go deeper into the practical aspects of applying CRDTs. We actually have loads of material to cover, and the challenge for me at the previous conference was, given the time constraints, to build up knowledge of what CRDTs are about and go into the practical part at the same time.

I am going to change the materials. I am going to change the slides and make it more practical, a little bit more hard core - in a good way, of course - with more examples. That is what people actually demanded after the Strange Loop Q&A session; because of time constraints I didn’t have enough time for the code samples there.

Question: 
QCon: Will it be a beginner, intermediate, or advanced talk?
Answer: 

Dmitry: Mostly intermediate: It doesn't assume any deep knowledge of how distributed systems work (or data synchronisation and replication in particular), and during the talk the whole nature of CRDTs is explained in plain words. The problems that our approach solves are a bit beyond the beginner level of developing software systems though.

Question: 
QCon: For whom is this talk intended?
Answer: 

Dmitry: It is a pretty technical talk, so mainly Architects, Tech Leads and Developers. The core idea of using CRDTs is absolutely platform agnostic, and is not restricted to ‘backend-only’ or ‘frontend-only’ areas. Actually, it's the opposite. The point that I try to make in the talk is that CRDTs are especially beneficial when used across the whole E2E chain: application <-> optional client library <-> service <-> storage.

Question: 
QCon: What made you decide to give a talk about CRDTs?
Answer: 

Dmitry: When we started the project in 2013, we immediately faced lots of challenges in implementing a reliable E2E application flow while supporting offline and concurrent data modification scenarios. And I guess that happens to lots of teams.

Many of these complexities can be tackled by implementing your own solution from scratch. And that's how we started. But covering all possible edge cases is hard. We needed a better way to deal with that.

And then one of my colleagues learned about CRDTs, a relatively modern (I think the first paper was published around 2011) data structure for doing distributed computations. Adopting CRDTs in our project helped us tremendously to write more reliable code (and less code, really!). And, of course, to solve our business problem: to synchronize our customers' data properly across unreliable networks.

Question: 
QCon: Can you give us an example of a pitfall which someone who isn’t thinking things through completely might run into?
Answer: 

Dmitry: Lots of the pitfalls are well-known problems with CRDTs. Garbage collection, for example. CRDTs tend to grow because of the metadata that is recorded. In the case of CRDT sets, if you delete an element, you keep accruing that metadata even though the element itself is deleted. Potentially, it is unbounded growth, and that’s a huge problem that is sometimes a no-go for CRDTs.

There are ways to minimize the impact, but in general it’s something that’s really, really hard to solve completely. In the talk I will mention that academic research on this has not stopped either. The initial CRDT implementations had this problem in a really, really obvious form, and the unbounded growth of CRDT metadata was a serious problem.
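
The metadata growth described above can be seen in a minimal sketch of a two-phase set (2P-Set), a textbook CRDT set that records removals as tombstones. This is an illustrative assumption, not the OUR-Set from the talk; `TwoPhaseSet` and `storedEntries` are names invented for this example.

```scala
// Two-phase set (2P-Set): an add-set plus a remove-set of tombstones.
// Illustrative sketch only; names are assumptions, not NavCloud code.
final case class TwoPhaseSet[A](
    added: Set[A] = Set.empty[A],
    removed: Set[A] = Set.empty[A]
) {
  def add(a: A): TwoPhaseSet[A] = copy(added = added + a)

  // Removal only records a tombstone for an element that was added;
  // nothing is ever erased, and a removed element cannot come back.
  def remove(a: A): TwoPhaseSet[A] =
    if (added(a)) copy(removed = removed + a) else this

  // What the application sees.
  def elements: Set[A] = added -- removed

  // Merge is a union of both underlying sets.
  def merge(that: TwoPhaseSet[A]): TwoPhaseSet[A] =
    TwoPhaseSet(added ++ that.added, removed ++ that.removed)

  // Entries that must be stored and shipped between replicas.
  def storedEntries: Int = added.size + removed.size
}

object TombstoneDemo {
  def main(args: Array[String]): Unit = {
    val full    = (1 to 1000).foldLeft(TwoPhaseSet[Int]())((s, i) => s.add(i))
    val emptied = (1 to 1000).foldLeft(full)((s, i) => s.remove(i))

    assert(emptied.elements.isEmpty)      // the user sees an empty set
    assert(emptied.storedEntries == 2000) // but every add and remove is still stored
  }
}
```

The visible set is empty, yet the replica still carries two entries per deleted element. Compacting those tombstones safely requires knowing that every replica has seen them, which is what makes garbage collection hard.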

There are other pitfalls in dealing with time. When you synchronize data, what about reliable time, and how reliable does the time need to be? Ideally, of course, there are lots of aspects you want to avoid dealing with completely, like absolute wall-clock time, and just rely on logical clocks instead. The problem with this approach is that you can’t work with time that is reliable enough. But of course, you need to understand the boundaries, and you still need to preserve the CRDT rules, the algebra of CRDTs, so that they converge. I will also talk about monotonicity of time: how you can enforce a monotonicity rule on the time of each single operation to ensure that updates build up in the right order.
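
One common way to get monotonic timestamps is sketched below: a per-replica clock that never goes backwards even when the wall clock does, feeding a last-writer-wins register. This is a generic technique shown under stated assumptions, not necessarily the approach used in NavCloud; `Stamp`, `MonotonicClock` and `LwwRegister` are hypothetical names.

```scala
// Timestamp with the replica id as a deterministic tie-breaker.
// All names in this sketch are illustrative assumptions.
final case class Stamp(time: Long, replica: String)

object Stamp {
  implicit val ordering: Ordering[Stamp] =
    Ordering.by((s: Stamp) => (s.time, s.replica))
}

// Per-replica clock: each stamp is strictly greater than the previous one,
// even if the injected wall clock jumps backwards.
final class MonotonicClock(replica: String, wallClock: () => Long) {
  private var last = Long.MinValue

  def next(): Stamp = {
    last = math.max(wallClock(), last + 1)
    Stamp(last, replica)
  }
}

// Last-writer-wins register: merge keeps the value with the larger stamp.
final case class LwwRegister[A](value: A, stamp: Stamp) {
  def set(newValue: A, clock: MonotonicClock): LwwRegister[A] =
    LwwRegister(newValue, clock.next())

  def merge(that: LwwRegister[A]): LwwRegister[A] =
    if (Stamp.ordering.gteq(stamp, that.stamp)) this else that
}

object ClockDemo {
  def main(args: Array[String]): Unit = {
    // Simulated wall clock that jumps backwards after the first reading.
    val readings = Iterator(100L, 90L)
    val clock    = new MonotonicClock("phone", () => readings.next())

    val first  = LwwRegister("home", clock.next()) // stamped 100
    val second = first.set("work", clock)          // wall clock says 90, stamp is 101

    assert(second.stamp.time == 101)
    // The later local update wins regardless of merge order.
    assert(first.merge(second) == second)
    assert(second.merge(first) == second)
  }
}
```

The clock only guarantees ordering of updates made on a single replica. Ordering across replicas still depends on how far their wall clocks drift apart, which is the boundary that has to be understood.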

If you have people integrating with your API, and your API exposes CRDTs (which you have to do when you offer different levels of API), what happens when an integrator wants to build an application on top of that API? The risk is that he or she completely skips the CRDT part and loses all the benefits and guarantees. You want to build up the whole chain with CRDTs: on the server, in the libraries and in the application. In that case the application gets all the benefits of being able to tackle offline scenarios and to handle the cases when your connection suddenly breaks and you have to retry. So CRDTs are really useful in the cloud as well, but when you integrate as an application developer, we as an API provider want to provide as seamless and smooth an experience as possible for working with CRDTs, because it is a tricky aspect.

Speaker: Dmitry Ivanov

Tech Lead @TomTom

Dmitry Ivanov is fascinated by everything related to building scalable and reliable distributed systems. Dmitry is currently a Tech Lead at TomTom, Amsterdam, where he works on the personal data synchronization and storage platform called NavCloud. Previously, Dmitry worked at Reltio, Inc. on a cloud-based MDM service, and at other startups. In his spare time he is involved in organising various programming meetups around Amsterdam, and occasionally gives talks at those.
