Redesigning OLTP for a New Order of Magnitude

The world is becoming more transactional.

From colocation and server rental to serverless and usage-based billing. From coal to clean energy and smart meters that arbitrage solar prices 1440 times a month instead of monthly. Not to mention FedNow or the tsunami of instant payments.

The volume of OLTP transactions across several sectors has grown by three orders of magnitude.

And yet two of the most popular open source OLTP database management systems in deployment are 30 years old, designed for a different world, and a different scale.

We have seen incredible advances in hardware and DBMS research since then.

There are hints of the need to redesign OLTP for a new order of magnitude: the move to proprietary cloud databases, and a creeping dependence on caching, even to the extreme of replacing the OLTP DBMS entirely with distributed microservices.

If this sounds high level, be warned that this is a deeply technical talk.

We're going to redesign OLTP from the ground up, with TigerBeetle, a new open source distributed financial transactions database, as a case study to see:

  • Why OLTP has a growing impedance mismatch.
  • Why the OLTP workload is becoming more contentious.
  • Why row locks, horizontal sharding, and betting on the speed of light in fiber can't compete with “diagonal scaling”: Moore's law and vertical scaling, together with the disaggregation of storage and compute.
  • What the last decade has taught us about log-structured merge trees as the local storage engine for OLTP. Why storage faults, write stalls, and non-determinism are now a problem. And how to exploit the workload and reduce write amplification by moving from an LSM-Tree to an LSM-Forest.
  • The challenge of strict serializability, mission-critical durability and high availability at scale. Why we can do better than off-the-shelf consensus protocols such as Raft and MultiPaxos, with new techniques such as Protocol-Aware Recovery, low-latency batching, out-of-order replication with in-order commitment, and optimistic state machine execution.
  • Finally—with the pure predictability of static memory allocation, the joy of a world without memory fragmentation, and the silver bullet of Deterministic Simulation Testing—why the future of OLTP is looking bright!
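
To make the contention point concrete, here is a back-of-the-envelope sketch in Python. It is not taken from the talk or from TigerBeetle; the function names and every number are hypothetical, chosen only to illustrate the arithmetic. It assumes transactions that touch the same hot row must run one after another, so throughput on that row is bounded by the inverse of the time each lock is held, and it compares that with a design that serializes whole batches instead of single transactions.

```python
def max_hot_row_tps(lock_hold_seconds: float) -> float:
    """Upper bound on transactions/second against one row when each
    transaction holds the row lock for lock_hold_seconds and all
    transactions on that row are serialized."""
    if lock_hold_seconds <= 0:
        raise ValueError("lock hold time must be positive")
    return 1.0 / lock_hold_seconds


def max_batched_tps(batch_size: int, batch_seconds: float) -> float:
    """Upper bound when one serialized step applies a whole batch of
    batch_size transactions and takes batch_seconds to complete."""
    if batch_size <= 0 or batch_seconds <= 0:
        raise ValueError("batch size and batch time must be positive")
    return batch_size / batch_seconds


if __name__ == "__main__":
    # Hypothetical: a lock held across a 1 ms network round trip.
    print(f"row lock held 1 ms: {max_hot_row_tps(0.001):,.0f} TPS on the hot row")
    # Hypothetical: 8,000 transactions applied per 10 ms batch.
    print(f"8,000 per 10 ms batch: {max_batched_tps(8000, 0.010):,.0f} TPS")
```

Under these assumed numbers the ceiling is set by how long each serialized step takes and how much work it carries, not by how many shards exist, which is why a workload concentrated on a few hot accounts gains little from horizontal sharding alone.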

Speaker

Joran Greef

Founder and CEO @TigerBeetle

Joran Dirk Greef is the Founder and CEO of TigerBeetle, the distributed financial accounting database designed for mission-critical safety and performance. His interests are storage, speed, and safety.

Date

Monday Oct 2 / 02:45PM PDT (50 minutes)

Location

Ballroom A

Topics

Distributed Systems, Database Management Systems, Architecture

From the same track

Session Stream Processing

Streaming Databases: Embracing the Convergence of Stream Processing and Databases

Monday Oct 2 / 01:35PM PDT

Streaming databases have gained significant attention in recent years. From its name, it is evident that a streaming database combines the power of stream processing and databases.

Yingjun Wu

Founder and CEO @RisingWave Labs, Previously Engineer @AWS Redshift & Researcher @IBM Research Almaden

Session Graph Databases

LIquid: A Large-Scale Relational Graph Database

Monday Oct 2 / 10:35AM PDT

We describe LIquid, the graph database built to host LinkedIn.

Scott Meyer

Distinguished Software Engineer @LinkedIn, Creator of the Graph Database, LIquid, Metaweb/freebase Alum

Session Data Lakes

Incremental Data Processing with Apache Hudi

Monday Oct 2 / 03:55PM PDT

Incremental Data Processing is an emerging style of data processing that has recently gained attention and has the potential to deliver orders-of-magnitude gains in speed and efficiency over traditional batch processing on data lakes and data warehouses.

Saketh Chintapalli

Software Engineer @Uber, Bringing Incremental Data Processing to Data Warehouse Models

Bhavani Sudha Saktheeswaran

Distributed Systems Engineer @Onehouse, Apache Hudi PMC, Ex-Moveworks, Ex-Uber, Ex-LinkedIn

Session Architecture

Sleeping at Scale - Delivering 10k Timers per Second per Node with Rust, Tokio, Kafka, and Scylla

Monday Oct 2 / 05:05PM PDT

As a part of OneSignal’s no-code Journeys system, we knew that we would need a way to store billions of timers.

Lily Mara

Engineering Manager @OneSignal, Author of "Refactoring to Rust"

Hunter Laine

Software Engineer @OneSignal

Session Data

PRQL: A Simple, Powerful, Pipelined SQL Replacement

Monday Oct 2 / 11:45AM PDT

Most databases use SQL as the interface to access relational data. Because of that, we have come to regard SQL as the language of relational algebra. But its affinity with the English language and its unclear and inconsistent semantics leave a lot of room for improvement.

Aljaž Mur Eržen

Compiler Developer @EdgeDB & PRQL Maintainer