Conference: Nov 13-15, 2017
Workshops: Nov 16-17, 2017
Presentation: Practical Data Synchronization using CRDTs
Key Takeaways
- Understand why the industry needs better tooling to develop offline-friendly applications and services
- Learn the basic mechanics behind Conflict-free Replicated Data Types using examples with counters and sets
- Become aware of common pitfalls of CRDTs and how those can be mitigated in a real system
Abstract
In a connected world, synchronising mutable information between different devices with different clock precision can be a difficult problem. A piece of data may have many out-of-sync replicas, but all of them should eventually reach a consistent state. For example, TomTom users, who may have personal navigation devices, smartphones and MyDrive website accounts, expect their navigation information to be synchronised properly even when a network connection is occasionally absent. Conflict-free Replicated Data Types (CRDTs) provide robust data structures to achieve proper synchronisation in an unreliable network of devices. They allow conflict resolution to be done locally at the data type level while guaranteeing eventual consistency between replicas.
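As an illustration of conflict resolution at the data type level, here is a minimal Scala sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class shape, the string replica IDs and the method names are assumptions made for this example, not code from the talk.

```scala
// Grow-only counter (G-Counter): one slot per replica, merged by element-wise max.
// Replica IDs are plain strings here purely for illustration.
final case class GCounter(counts: Map[String, Long] = Map.empty) {

  // A replica only ever increments its own slot.
  def increment(replicaId: String, by: Long = 1L): GCounter = {
    require(by >= 0, "a G-Counter can only grow")
    GCounter(counts.updated(replicaId, counts.getOrElse(replicaId, 0L) + by))
  }

  // The counter value is the sum over all replica slots.
  def value: Long = counts.values.sum

  // Merge is commutative, associative and idempotent, so replicas converge
  // regardless of delivery order or duplicated messages.
  def merge(other: GCounter): GCounter =
    GCounter((counts.keySet ++ other.counts.keySet).map { id =>
      id -> math.max(counts.getOrElse(id, 0L), other.counts.getOrElse(id, 0L))
    }.toMap)
}
```

If a "phone" replica increments twice and a "web" replica increments by three while both are offline, merging the two states in either order yields a value of 5, and merging a state with itself changes nothing.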
Besides introducing common CRDTs, the talk focuses on a special subtype of CRDT set called OUR-Set (Observed, Updated, Removed), which we created to extend known CRDT sets with update functionality.
I will demonstrate basic implementations of various CRDTs in Scala and enumerate subtle considerations that should be taken into account. I will also explain the advantages of these data structures in solving many synchronisation problems, as well as their limitations.
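To give a feel for what update support in a CRDT set can involve, here is a simplified last-writer-wins sketch in Scala. It is not the OUR-Set implementation: the `Versioned` and `UpdatableSet` names, the explicit timestamps and the tie-breaking rule are all assumptions chosen for illustration.

```scala
// Hypothetical element: `id` identifies it, `timestamp` orders its versions,
// and `removed` marks a tombstone. None of these names come from the talk.
final case class Versioned(id: String, value: String, timestamp: Long, removed: Boolean = false)

final case class UpdatableSet(entries: Map[String, Versioned] = Map.empty) {

  // Pick the version with the highest (timestamp, removed, value) rank.
  // Using a total order makes the choice deterministic on every replica.
  private def winner(a: Versioned, b: Versioned): Versioned =
    Seq(a, b).maxBy(e => (e.timestamp, e.removed, e.value))

  private def put(e: Versioned): UpdatableSet =
    UpdatableSet(entries.updated(e.id, entries.get(e.id).fold(e)(winner(_, e))))

  // Adding and updating are the same operation: a newer version of the same id.
  def addOrUpdate(id: String, value: String, timestamp: Long): UpdatableSet =
    put(Versioned(id, value, timestamp))

  // A removal is recorded as a tombstone so that it can win against older updates.
  def remove(id: String, timestamp: Long): UpdatableSet =
    put(Versioned(id, "", timestamp, removed = true))

  // Merging keeps the winning version for every id seen by either replica.
  def merge(other: UpdatableSet): UpdatableSet =
    other.entries.values.foldLeft(this)((acc, e) => acc.put(e))

  // Only live (non-removed) values are visible to the application.
  def values: Set[String] = entries.values.filterNot(_.removed).map(_.value).toSet
}
```

Because every replica picks the winner per `id` using the same total order, merging in any order gives the same result. The price is that tombstones must be kept, and that the outcome is only as good as the timestamps fed into it.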
Interview
Dmitry: I’m a Tech Lead / Architect at TomTom working on a project called NavCloud. NavCloud is a cloud-based service for storing and seamlessly synchronizing personal data across user devices. I'm responsible for the overall architecture of the platform, including its infrastructure, services and client libraries.
Dmitry: The main focus of my talk is the data synchronization domain, which is a core part of the project I work on. In the "CRDTs" part I will talk about solving challenges using a modern academic achievement in the distributed systems area: Conflict-Free Replicated Data Types. These are special data structures that are a great fit for the use case I'm going to talk about (but not only for that). Of course, I'm going to talk from a practical, real-world perspective: how a pragmatic developer can use them. What are the limitations? What do the promises described in CRDT academic papers really mean in practice?
Dmitry: The talk at QCon SF will go deeper into the practical aspects of applying CRDTs. We actually have loads of material to cover, and the challenge for me at the previous conference was, given the time constraints, to build up knowledge of what CRDTs are about and go into the practical part at the same time.
I am going to change the materials. I am going to change the slides and make it more practical, a little bit more hard core - in a good way, of course - with more examples. That is what people actually demanded after the Strange Loop Q&A session; because of time constraints I didn’t have enough time for the code samples there.
Dmitry: Mostly intermediate: It doesn't assume any deep knowledge of how distributed systems work (or data synchronisation and replication in particular), and during the talk the whole nature of CRDTs is explained in plain words. The problems that our approach solves are a bit beyond the beginner level of developing software systems though.
Dmitry: It is a pretty technical talk, so mainly Architects, Tech Leads and Developers. The core idea of using CRDTs is absolutely platform agnostic, and is not restricted to ‘backend-only’ or ‘frontend-only’ areas. Actually, it's the opposite. The point that I try to make in the talk is that CRDTs are especially beneficial when used across the whole E2E chain: application <-> optional client library <-> service <-> storage.
Dmitry: When we started the project in 2013, we immediately faced lots of challenges in implementing a reliable E2E application flow while supporting offline and concurrent data modification scenarios. And I guess that happens to lots of teams.
Many of these complexities can be tackled by implementing your own solution from scratch. And that's how we started. But covering all possible edge cases is hard. We needed a better way to deal with that.
And then one of my colleagues learned about CRDTs, a relatively modern (I think the first paper was published around 2011) data structure for doing distributed computations. Adopting CRDTs in our project helped us tremendously to write more reliable code (and less code, really!). And, of course, to solve our business problem: to synchronize our customers' data properly across unreliable networks.
Dmitry: Lots of the pitfalls are well-known problems with CRDTs. Garbage collection, for example. CRDTs tend to grow because of the metadata that is recorded. In the case of CRDT sets, when you delete an element, the element goes away but the metadata about it stays. Potentially, that is unbounded growth, and it's a huge problem that is sometimes a no-go for CRDTs.
There are ways to minimize the impact. But in general, it’s something that’s really, really hard to solve completely. In the talk I will mention that academic research has not stopped there either. The initial CRDT implementations had that problem, which was really, really obvious, and the unbounded growth of CRDT metadata was a serious problem.
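The metadata growth described above can be seen in even the simplest removable set. The sketch below uses a two-phase set (2P-Set), chosen here only because it is compact; the `storedEntries` helper is a hypothetical addition for counting what a replica has to keep.

```scala
// Two-phase set (2P-Set): removals are recorded as tombstones and never discarded.
final case class TwoPhaseSet[A](added: Set[A] = Set.empty[A], removed: Set[A] = Set.empty[A]) {

  def add(a: A): TwoPhaseSet[A] = copy(added = added + a)

  // Only elements that were observed as added can be removed.
  def remove(a: A): TwoPhaseSet[A] =
    if (added(a)) copy(removed = removed + a) else this

  // What the application sees: added elements minus tombstoned ones.
  def elements: Set[A] = added -- removed

  // Merge is a union of both components, so it converges in any order.
  def merge(other: TwoPhaseSet[A]): TwoPhaseSet[A] =
    TwoPhaseSet(added ++ other.added, removed ++ other.removed)

  // Number of entries kept in storage, including tombstones.
  def storedEntries: Int = added.size + removed.size
}
```

After adding and then removing 100 distinct elements, `elements` is empty while `storedEntries` is 200: the visible set shrinks but the replicated state only grows.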
There are other pitfalls in dealing with time. When you synchronize data, what about reliable time, and how reliable does the time need to be? Ideally, of course, you want to avoid dealing with things like an absolute wall clock altogether and rely only on logical clocks. The problem with this approach is that you can’t work with time that is reliable enough. But of course, you need to understand the boundaries, and you still need to preserve the CRDT rules, the algebra of CRDTs, so that they converge. I will also talk about monotonicity of time: how you can enforce a monotonicity rule on the timestamp of each operation to ensure that updates are applied in the right order.
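One common way to get monotonic timestamps on a single replica is to never hand out a value lower than or equal to the previous one, whatever the wall clock says. The sketch below shows that idea only; the `MonotonicClock` name and its injectable `wallClock` parameter are assumptions for this example, not a description of how NavCloud handles time.

```scala
// Hypothetical per-replica timestamp source: every call returns a value strictly
// greater than the previous one, even if the wall clock stalls or jumps backwards.
final class MonotonicClock(wallClock: () => Long = () => System.currentTimeMillis()) {
  private var last = Long.MinValue

  def next(): Long = synchronized {
    // Follow the wall clock when it moves forward, otherwise step past the last value.
    last = math.max(wallClock(), last + 1)
    last
  }
}
```

With wall-clock readings of 100, 100, 90 and 150, successive calls to `next()` return 100, 101, 102 and 150, so later local operations always carry later timestamps. This only orders operations within one replica; ordering across replicas still depends on how far their clocks drift apart.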
If you have people integrating with your API, and your API exposes CRDTs, which you have to do because you have different levels of API, what happens when an integrator wants to build an application on top of that API? The risk is that he or she completely skips the CRDT part and loses all the benefits and guarantees. You want to build the whole chain with CRDTs: on the server, in the libraries and in the application. In that case the application gets all the benefits of being able to tackle offline scenarios, such as when your connection suddenly breaks and you have to retry. So CRDTs are really useful in the cloud as well, but as an API provider we want to make working with CRDTs as seamless and smooth as possible for the application developer, because it is a tricky aspect.