Patterns of Streaming Applications

Stream processing engines are becoming pivotal in analyzing data. They have evolved beyond data transport and simple processing machinery into engines capable of complex processing. The necessary features and building blocks of these engines are well known. And most capable engines have a...

Monal Daxini Distributed Systems Engineer / Leader @Netflix

Human-Centric Machine Learning Infrastructure @Netflix

Netflix has over 100 data scientists applying machine learning to a wide range of business problems from title popularity predictions to quality of streaming optimizations. Our unique culture gives data scientists plenty of freedom to choose the modeling approach, libraries, and even the...

Ville Tuulos Machine Learning Infrastructure Engineer @Netflix

Training Deep Learning Models at Scale on Kubernetes

Deep learning has recently become very important for all kinds of AI applications, from conversational chatbots to self-driving cars. In this talk, we discuss how we use deep learning for natural language processing, utilize TensorFlow for training deep learning models, run TensorFlow on...

Deepak Bobbarjung Founding Engineer @PassageAI
Mitul Tiwari CTO @PassageAI

Massively scaling MySQL using Vitess

Are you dealing with the challenges of rapid growth? Are you thinking about how to scale your database layer? Should you use NoSQL? Should you shard your relational database? If you are facing these kinds of problems, this session is for you. Vitess is a database solution for deploying, scaling...

Sugu Sougoumarane Co-Founder / CTO @planetscaledata & Co-Creator @vitessio

The Whys and Hows of Database Streaming

Batch-style ETL pipelines have been the de facto method for getting data from OLTP to OLAP database systems for a long time. At WePay, when we first built our data pipeline from MySQL to BigQuery, we adopted this tried-and-true approach. However, as our company scaled and our business needs grew,...

Joy Gao Sr. Software Engineer @WePay

Custom, Complex Windows @Scale Using Apache Flink

With 100 million members in over 190 countries, more than 1 trillion events and 3 PB of data flow through Netflix’s real-time data infrastructure each day. We’ve built a data pipeline in the cloud that reliably collects and routes these events to a variety of sinks. The data in...

What's the focus of your work?

Recently, I’ve primarily been building data platforms. That is, platforms to enable Data and Software Engineers to collect and process data.

Serhat Yilmaz Software Engineer @Facebook

Data Decisions With Realtime Stream Processing

QCon: What's the focus of your work and of the team that you're on at Facebook?

Rajesh: My team works on stream processing, and we are part of the real-time data organization, which focuses on faster, simpler, and smarter delivery of data. We want to reduce the time to results for the people and data-driven products that rely on this data. Our organization encompasses the stream...

Can you give an example of some of the questions you get from data scientists when you are trying to deploy models?

When it comes to common questions, as boring as it may sound, my experience is that machine learning infrastructure is much more about data than science. Most questions we get are related to data: how do I find the data I need, how do I set up the data pipeline, how do I handle the somewhat non-trivial amounts of data in Python and R,...

