Track: Stream Processing In The Modern Age

Location: Bayview AB

Day of week: Tuesday

Stream processing pipelines have become essential to building engaging experiences on the web today. Whether you enjoy personalized news feeds on LinkedIn and Facebook, profit from near-real-time updates to search engines and recommender systems, or benefit from near-real-time fraud detection on a lost or stolen credit card, you have come to rely on the fruits of stream processing as an end user. As an ops-focused engineer, you may employ stream processing to understand complex call trees in your microservice-based infrastructure with the aim of eliminating redundant system load or improving mobile and web application performance. As a business analyst, you may want to execute a SQL query on a live stream of data to reveal insights. Come learn about interesting applications of stream processing as well as recent advances in the field.

Track Host:
Tyler Akidau
Engineer @Google & Founder/Committer on Apache Beam

Tyler Akidau is a senior staff software engineer at Google Seattle. He leads technical infrastructure’s internal data processing teams in Seattle (MillWheel & Flume), is a founding member of the Apache Beam PMC, and has spent the last seven years working on massive-scale data processing systems. Though deeply passionate and vocal about the capabilities and importance of stream processing, he is also a firm believer in batch and streaming as two sides of the same coin, with the real endgame for data processing systems being the seamless merging of the two. He is the author of the 2015 Dataflow Model paper, the Streaming 101 and Streaming 102 articles, and the upcoming Streaming Systems book. His preferred mode of transportation is by cargo bike, with his two young daughters in tow.

10:35am - 11:25am

by Matt Zimmer
Real-time Data Infrastructure Senior Engineer @Netflix

With 100 million members in over 190 countries, more than 1 trillion events and 3 PB of data flow through Netflix’s real-time data infrastructure each day. We’ve built a data pipeline in the cloud that reliably collects and routes these events to a variety of sinks. The data in these events is used in several ways, from personalizing the customer experience to business intelligence.

The windowing capabilities offered by most stream processing engines are limited to aligned...
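As general background on the term "aligned", the following is a minimal Python sketch, not the Netflix pipeline. It contrasts fixed (tumbling) windows, whose boundaries are the same for every key, with per-key session windows, used here as one assumed example of unaligned windowing. The function names, the `(key, timestamp)` event shape, and the sample data are hypothetical.

```python
from collections import defaultdict

def tumbling_windows(events, size):
    """Aligned windows: every key shares the boundaries [k*size, (k+1)*size)."""
    windows = defaultdict(list)
    for key, ts in events:
        start = (ts // size) * size
        windows[(key, start, start + size)].append(ts)
    return dict(windows)

def session_windows(events, gap):
    """Unaligned windows: each key gets its own boundaries, and a session
    closes once more than `gap` time units pass without an event."""
    by_key = defaultdict(list)
    for key, ts in events:
        by_key[key].append(ts)
    sessions = {}
    for key, stamps in by_key.items():
        stamps.sort()
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > gap:
                # Inactivity gap exceeded: close the session and start a new one.
                sessions[(key, current[0], current[-1])] = current
                current = [ts]
            else:
                current.append(ts)
        sessions[(key, current[0], current[-1])] = current
    return sessions

# Hypothetical (key, event-time) pairs.
events = [("a", 1), ("a", 4), ("a", 12), ("b", 7), ("b", 9), ("b", 30)]
aligned = tumbling_windows(events, size=10)
unaligned = session_windows(events, gap=5)
```

In the aligned result, keys "a" and "b" both fall into the shared window from 0 to 10, whereas the session boundaries differ for each key because they depend on that key's own activity.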

11:50am - 12:40pm

by Serhat Yilmaz
Software Engineer @Facebook

At Facebook, we can move fast and iterate because of our ability to make data-driven decisions. Our stream processing systems provide real-time data analytics and insights; they are also integrated into various Facebook products, which have to aggregate data from many sources. In this talk, we cover:

  1. the difficulties of stream processing at scale
  2. the solutions we've created to date
  3. three case studies on improving the time to deliver insights with...

1:40pm - 2:30pm

by Vasia Kalavri
PMC Member of Apache Flink (Core Developer Graph Processing API) & Postdoctoral researcher at the ETH Zurich Systems group

A modern enterprise datacenter is a complex, multi-layered system whose components often interact in unpredictable ways. Yet, to keep operational costs low and maximize efficiency, we would like to foresee the impact of changing workloads, updating configurations, modifying policies, or deploying new services.

In this talk, I will share our research group’s ongoing work on Strymon: a system for predicting datacenter behavior in hypothetical scenarios using queryable online simulation...

2:55pm - 3:45pm

by Stephan Ewen
Committer @ApacheFlink, CTO @dataArtisans

Come learn how Apache Flink handles stateful stream processing and how to manage distributed stream processing and data-driven applications efficiently with Flink's checkpoints and savepoints.

Over the last few years, data stream processing has redefined how many of us build data pipelines. Apache Flink is one of the systems at the forefront of that development: with its versatile APIs (event-time streaming, Stream SQL, events/state) and powerful execution model, Flink has been part of...

4:10pm - 5:00pm

by Tyler Akidau
Engineer @Google & Founder/Committer on Apache Beam

What does it mean to execute robust streaming queries in SQL? What is the relationship of streaming queries to classic relational queries? Are streams and tables the same thing conceptually, or different? And how does all of this relate to the programmatic frameworks we’re all familiar with? This talk will address all of those questions in two parts.

First, we’ll explore the relationship between the Beam Model (as...

5:25pm - 6:15pm

by Julian Hyde
Original Developer @ApacheCalcite, Co-Founder SQLstream, & Architect @Hortonworks

by Tyler Akidau
Engineer @Google & Founder/Committer on Apache Beam

by Jay Kreps
Co-Founder and CEO @Confluent

by Michael Armbrust
Initial Author of Apache Spark SQL & Leads Streaming Team @Databricks

by Stephan Ewen
Committer @ApacheFlink, CTO @dataArtisans

Queries over streams are generally "continuous," executing for long periods of time and returning incremental results. Yet operations over streams must have the ability to be monotonic. A new generation of stream processing engines has added support for Stream SQL. This AMA / panel features a discussion with thought leaders who are evolving and shaping the space.
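To make "continuous" concrete, here is a minimal Python sketch of a query that never returns a single final answer and instead emits an updated result as each event arrives. It is an illustration only, not how any of the panelists' engines work; the `continuous_count` function and the `clicks` data are hypothetical.

```python
from collections import Counter

def continuous_count(stream):
    """Roughly a per-key COUNT(*) evaluated continuously: after every
    incoming event, yield the refreshed (key, count) row."""
    counts = Counter()
    for key in stream:
        counts[key] += 1
        yield key, counts[key]

# Hypothetical click stream; a real stream would be unbounded.
clicks = iter(["home", "search", "home", "cart", "home"])
updates = list(continuous_count(clicks))
```

Each emitted row refines an earlier one rather than contradicting it: a count only grows as input arrives, which is one simple case of a monotonic operation.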


Tracks

  • Architectures You've Always Wondered About

    Architectural practices from the world's most well-known properties, featuring startups, massive scale, evolving architectures, and software tools used by nearly all of us.

  • Going Serverless

    Learn about the state of Serverless & how to successfully leverage it! Lessons learned in the track hit on security, scalability, IoT, and offer warnings to watch out for.

  • Microservices: Patterns and Practices

    Stories of success and failure building modern Microservices, including event sourcing, reactive, decomposition, & more.

  • DevOps: You Build It, You Run It

    Pushing DevOps beyond adoption into cultural change. Hear about designing resilience, managing alerting, CI/CD lessons, & security. Features lessons from open source, LinkedIn, Netflix, Financial Times, & more.

  • The Art of Chaos Engineering

    Failure is going to happen - Are you ready? Chaos engineering is an emerging discipline - What is the state of the art?

  • The Whole Engineer

    Success as an engineer is more than writing code. Hear inward looking thoughts on inclusion, attitude, leadership, remote working, and not becoming the brilliant jerk.

  • Evolving Java

    Java continues to evolve & change. Track covers Spring 5, async, Kotlin, serverless, the 6-month cadence plans, & AI/ML use cases.

  • Security: Attacking and Defending

    Offensive and defensive security evolution that application developers should know about, including SGX Enclaves, effects of AI, software exploitation techniques, & crowd defense.

  • The Practice & Frontiers of AI

    Learn about machine learning in practice and on the horizon. Learn about ML at Quora, Uber's Michelangelo, ML workflow with Netflix Meson, and topics on bots, conversational interfaces, automation, and deployment practices in the space.

  • 21st Century Languages

    Compile to Native, Microservices, Machine learning... tailor-made languages solving modern challenges, featuring use cases around Go, Rust, C#, and Elm.

  • Modern CS in the Real World

    Applied trends in Computer Science that are likely to affect Software Engineers today. Topics include category theory, crypto, CRDTs, logic-based automated reasoning, and more.

  • Stream Processing In The Modern Age

    Compelling applications of stream processing using Flink, Beam, Spark, Strymon & recent advances in the field, including Custom Windowing, Stateful Streaming, SQL over Streams.