Track: Stream Processing In The Modern Age

Location: Bayview AB

Day of week: Tuesday

Stream processing pipelines have become essential to building engaging experiences on the web today. Whether you enjoy personalized news feeds on LinkedIn and Facebook, profit from near-real-time updates to search engines and recommender systems, or benefit from near-real-time fraud detection on a lost or stolen credit card, you have come to rely on the fruits of stream processing as an end user. As an ops-focused engineer, you may employ stream processing to understand complex call trees in your microservice-based infrastructure, with the aim of eliminating redundant system load or improving mobile and web application performance. As a business analyst, you may want to execute a SQL query on a live stream of data to reveal insights. Come learn about interesting applications of stream processing as well as recent advances in the field.

Track Host:
Tyler Akidau
Engineer @Google & Founder/Committer on Apache Beam

Tyler Akidau is a senior staff software engineer at Google Seattle. He leads technical infrastructure’s internal data processing teams in Seattle (MillWheel & Flume), is a founding member of the Apache Beam PMC, and has spent the last seven years working on massive-scale data processing systems. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper, the Streaming 101 and Streaming 102 articles, and the upcoming Streaming Systems book. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

10:35am - 11:25am

by Matt Zimmer
Real-time Data Infrastructure Senior Engineer @Netflix

With 100 million members in over 190 countries, more than 1 trillion events and 3 PB of data flow through Netflix’s real-time data infrastructure each day. We’ve built a data pipeline in the cloud that reliably collects and routes these events to a variety of sinks. The data in these events is used in several ways, from personalizing the customer experience to business intelligence.

The windowing capabilities offered by most stream processing engines are limited to aligned...

11:50am - 12:40pm

Open Space

1:40pm - 2:30pm

by Vasia Kalavri
PMC Member of Apache Flink (Core Developer Graph Processing API) & Postdoctoral researcher at the ETH Zurich Systems group

A modern enterprise datacenter is a complex, multi-layered system whose components often interact in unpredictable ways. Yet, to keep operational costs low and maximize efficiency, we would like to foresee the impact of changing workloads, updating configurations, modifying policies, or deploying new services.

In this talk, I will share our research group’s ongoing work on Strymon: a system for predicting datacenter behavior in hypothetical scenarios using queryable online simulation...

2:55pm - 3:45pm

by Tyler Akidau
Engineer @Google & Founder/Committer on Apache Beam

What does it mean to execute robust streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing conceptually, or different? And how does all of this relate to the programmatic frameworks we’re all familiar with? This talk will address all of those questions in two parts.

First, we’ll explore the relationship between the Beam Model (as...

4:10pm - 5:00pm

Abstract Coming Soon

5:25pm - 6:15pm

by Rajesh Nishtala
Realtime Data Engineer @Facebook

At Facebook, we can move fast and iterate because of our ability to make data-driven decisions. Our stream processing systems provide real-time analytics and insights; they are also built into various Facebook products, which have to aggregate data from many sources. In this talk, we cover (1) the difficulties of stream processing at scale, (2) the solutions we've created to date, and (3) three case studies on improving the time to...