The online world we interact with today is increasingly powered by data and by the insights extracted from it. Our ever-growing thirst for data insights and data-driven behavior (e.g., ML-based systems) is driving our industry to collect data more often and from an increasingly varied set of sources. As data volumes grow, scale becomes a challenge. To complicate matters further, customers want reliable access to high-quality data and insights, which adds availability and data quality to our list of requirements. More often than not, customers require low latency as well, usually meaning the time it takes to turn raw data into usable insights or production-grade models. Last but not least, access patterns and use cases dictate the form data will take when being served!
Depending on how the data will be used, the medium used to store and serve it will vary widely. OLTP/OLAP DBs, caches, object stores, search engines, graph DBs, data streams, vector DBs, and the like represent the many forms data takes to suit its many uses. Come to this track to learn about new technologies, practices, and trends shaping the way you will work with data.
From this track
LIquid: A Large-Scale Relational Graph Database
Monday Oct 2 / 10:35AM PDT
We describe LIquid, the graph database built to host LinkedIn.
Scott Meyer
Distinguished Software Engineer @LinkedIn, Creator of the Graph Database, LIquid, Metaweb/Freebase Alum
PRQL: A Simple, Powerful, Pipelined SQL Replacement
Monday Oct 2 / 11:45AM PDT
Most databases use SQL as the interface to access relational data. Because of that, we think of SQL as the language of relational algebra. But its affinity with the English language and its unclear, inconsistent semantics leave a lot of room for improvement.
Aljaž Mur Eržen
Compiler Developer @EdgeDB & PRQL Maintainer
Streaming Databases: Embracing the Convergence of Stream Processing and Databases
Monday Oct 2 / 01:35PM PDT
Streaming databases have gained significant attention in recent years. As the name suggests, a streaming database combines the power of stream processing and databases.
Yingjun Wu
Founder and CEO @RisingWave Labs, Previously Engineer @AWS Redshift & Researcher @IBM Research Almaden
Redesigning OLTP for a New Order of Magnitude
Monday Oct 2 / 02:45PM PDT
The world is becoming more transactional. From colocation and server rental to serverless and usage-based billing. From coal to clean energy and smart meters that arbitrage solar prices 1440 times a month instead of monthly. Not to mention FedNow or the tsunami of instant payments.
Joran Greef
Founder and CEO @TigerBeetle
Incremental Data Processing with Apache Hudi
Monday Oct 2 / 03:55PM PDT
Incremental Data Processing is an emerging style of data processing that has recently been gathering attention and has the potential to deliver orders-of-magnitude gains in speed and efficiency over traditional batch processing on data lakes and data warehouses.
Saketh Chintapalli
Software Engineer @Uber, Bringing Incremental Data Processing to Data Warehouse Models
Bhavani Sudha Saktheeswaran
Distributed Systems Engineer @Onehouse, Apache Hudi PMC, Ex-Moveworks, Ex-Uber, Ex-LinkedIn
Sleeping at Scale - Delivering 10k Timers per Second per Node with Rust, Tokio, Kafka, and Scylla
Monday Oct 2 / 05:05PM PDT
As a part of OneSignal’s no-code Journeys system, we knew that we would need a way to store billions of timers.
Lily Mara
Engineering Manager @OneSignal, Author of "Refactoring to Rust"
Hunter Laine
Software Engineer @OneSignal