Sleeping at Scale - Delivering 10k Timers per Second per Node with Rust, Tokio, Kafka, and Scylla

As part of OneSignal’s no-code Journeys system, we knew that we would need a way to store billions of timers. OneSignal delivers over 11 billion notifications per day, and we would potentially need to store a timer event alongside each of these for customer-defined journeys like “send a notification, wait 24h, send another notification if a condition is met.” After searching the paid and open-source worlds for off-the-shelf solutions, we decided that our best option was to build our own system to store and expire timers. Ingesting data via gRPC, persisting to Scylla, and expiring to Kafka via a Rust scheduler has proved to be a great solution for us. This talk walks through the design of this system, its performance characteristics, and how we’ve scaled it beyond the single-node service it was for the first year and a half of its existence.
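To make the "store and expire timers" idea concrete, here is a minimal sketch of its core in Rust. It is not OneSignal's implementation: the abstract describes timers persisted in Scylla and expired to Kafka, whereas this sketch assumes an in-memory min-heap as a stand-in for the store and simply returns the due timers instead of publishing them. The names `Timer`, `TimerQueue`, `schedule`, and `pop_expired` are hypothetical and chosen only for illustration.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical timer record: an expiry time (seconds since some epoch)
/// and an opaque payload to deliver when the timer fires.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Timer {
    expires_at: u64,
    payload: String,
}

/// In-memory stand-in for a persistent timer store (an assumption of this
/// sketch; the system described above persists timers to Scylla).
#[derive(Default)]
struct TimerQueue {
    // `Reverse` turns the max-heap into a min-heap ordered by `expires_at`.
    heap: BinaryHeap<Reverse<Timer>>,
}

impl TimerQueue {
    /// Store a timer that should fire at `expires_at`.
    fn schedule(&mut self, expires_at: u64, payload: &str) {
        self.heap.push(Reverse(Timer {
            expires_at,
            payload: payload.to_string(),
        }));
    }

    /// Remove and return every timer due at or before `now`, earliest first.
    /// A real scheduler would publish these (for example to a message queue)
    /// rather than return them to the caller.
    fn pop_expired(&mut self, now: u64) -> Vec<Timer> {
        let mut due = Vec::new();
        while self.heap.peek().map_or(false, |r| r.0.expires_at <= now) {
            if let Some(Reverse(timer)) = self.heap.pop() {
                due.push(timer);
            }
        }
        due
    }
}
```

A caller would invoke `schedule` as timers arrive and call `pop_expired` with the current time on a regular tick. The heap keeps the earliest expiry at the top, so each tick inspects only the timers that are actually due.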


Speaker

Lily Mara

Engineering Manager @OneSignal, Author of "Refactoring to Rust"

Lily Mara is an Engineering Manager at OneSignal in San Mateo, CA. She manages the Core Services team, which is responsible for in-house services used by other OneSignal engineering teams. Previously she was a software engineer at OneSignal, leading the efforts to create OneSignal's integration with Mixpanel, develop the outcomes system, and improve performance and code simplicity through refactoring. Lily also worked as a software developer at Kroger, working on Kroger’s online grocery ordering system as well as internal development tools to aid other teams with deployments, monitoring, and local development environments.

Lily is the author of Refactoring to Rust, an early-access book by Manning Publications about improving the performance of existing software systems through the gradual addition of Rust code.


Speaker

Hunter Laine

Software Engineer @OneSignal

Hunter Laine is a Software Engineer on the Infrastructure Services team at OneSignal in San Mateo, CA. In this role, she helps to maintain and improve in-house services used by internal engineering teams. In this capacity, Hunter led a major migration of one of OneSignal’s most trafficked endpoints from a legacy Ruby on Rails codebase into a faster, more efficient and maintainable Go codebase. She spearheaded scaling OneSignal’s timer scheduler infrastructure from a single-tenant service to a multi-tenant service.

Prior to her work at OneSignal, Hunter held positions as a Marketing Operations Manager abroad. OneSignal has been an incredible place to learn quickly and take a lead role in major initiatives.


Date

Monday Oct 2 / 05:05PM PDT (50 minutes)

Location

Ballroom A

Topics

Architecture Rust Concurrency Database

From the same track

Session Stream Processing

Streaming Databases: Embracing the Convergence of Stream Processing and Databases

Monday Oct 2 / 01:35PM PDT

Streaming databases have gained significant attention in recent years. From its name, it is evident that a streaming database combines the power of stream processing and databases.

Yingjun Wu

Founder and CEO @RisingWave Labs, Previously Engineer @AWS Redshift & Researcher @IBM Research Almaden

Session Graph Databases

LIquid: A Large-Scale Relational Graph Database

Monday Oct 2 / 10:35AM PDT

We describe LIquid, the graph database built to host LinkedIn.

Scott Meyer

Distinguished Software Engineer @LinkedIn, Creator of the Graph Database, LIquid, Metaweb/freebase Alum

Session Distributed Systems

Redesigning OLTP for a New Order of Magnitude

Monday Oct 2 / 02:45PM PDT

The world is becoming more transactional. From colocation and server rental to serverless and usage-based billing. From coal to clean energy and smart meters that arbitrage solar prices 1440 times a month instead of monthly. Not to mention FedNow or the tsunami of instant payments.

Joran Greef

Founder and CEO @TigerBeetle

Session Data Lakes

Incremental Data Processing with Apache Hudi

Monday Oct 2 / 03:55PM PDT

Incremental Data Processing is an emerging style of data processing that has recently been gathering attention and has the potential to deliver orders-of-magnitude gains in speed and efficiency over traditional batch processing on data lakes and data warehouses.

Saketh Chintapalli

Software Engineer @Uber, Bringing Incremental Data Processing to Data Warehouse Models

Bhavani Sudha Saktheeswaran

Distributed Systems Engineer @Onehouse, Apache Hudi PMC, Ex-Moveworks, Ex-Uber, Ex-LinkedIn

Session Data

PRQL: A Simple, Powerful, Pipelined SQL Replacement

Monday Oct 2 / 11:45AM PDT

Most databases use SQL as the interface to access relational data. Because of that, we think of SQL as the language of relational algebra. But its affinity with the English language and its unclear and inconsistent semantics leave a lot of room for improvement.

Aljaž Mur Eržen

Compiler Developer @EdgeDB & PRQL Maintainer