Track:

Duration

Duration:

2:55pm - 3:45pm

Level:

Intermediate

Persona:

Architect
Developer
Developer, JVM

Key Takeaways

Understand the architecture, concerns, and challenges building a high volume real-time streaming solution like Trending Now leveraging Spark Streaming.
Hear practical advice implementing a real-time streaming service, such as joining multiple live streams of data, data persistence patterns, and resiliency considerations.
Learn some of the strengths and weakness Netflix uncovered using Spark Streaming to better understand whether it fits your individual use case.

Abstract

Recommendations play a vital role in a great Netflix experience. Traditionally, these recommendations are precomputed using viewing history, scroll activity, and a variety of other signals in a near-line fashion. To be able to react more quickly to surges and dips in interest, we introduced the Trending Now row that makes use of real time data as an additional signal for generating recommendations. This allows us to not only personalize this row based on the context like time of day and day of week, but also react to sudden changes in collective interests of members, due to real-world events.

We will discuss the data pipeline that we built to process Netflix user activities in real time for the Trending Now row. We will share our experiences with developing, monitoring, and productionalizing a system that uses Kafka, Spark Streaming, and Cassandra.

Interview

Question:

QCon: What is your role today?

Answer:

Elliot: I am a software engineer at Netflix on the Personalization Infrastructure team. We develop data pipelines and other infrastructure to help the personalization algorithms team make recommendations.

Question:

QCon: What is the scale of recommendations at Netflix? It must be huge.

Answer:

Elliot: Definitely. With over 70 million of subscribers, there is a lot of traffic every day. In addition to streaming content, we also collect telemetry about how our users are engaging with Netflix. Since we want to generate the best recommendations and user experience, we have to consider all of that data.

Question:

QCon: How much of that data is used for real time? Is it just a small subset or massive amounts of data?

Answer:

Elliot: A significant part of the real time data we process is made up of what people browse through on Netflix and what people watch, and people are watching Netflix at all hours during the day. All of this data comes in through Kafka, so everything is fairly real time. There obviously is a lot of data cleaning that needs to be done, and depending on the algorithms that consume the data, we have to adjust our filtering criteria accordingly.

Question:

QCon: How do you describe the persona of the target audience of this talk?

Answer:

Elliot: Someone who is considering using Spark Streaming or using it more extensively. They would also be involved in implementation.

Question:

QCon: What’s the motivation for your talk?

Answer:

Elliot: Trending Now is one of the first use cases at Netflix where we try to make use of real time data for personalization. We are also one of the first at Netflix to be using Spark Streaming. There have been many questions about how well this will work, especially in a production environment at Netflix's scale. Our team had looked into Spark Streaming maybe a year or two ago and it wasn’t quite mature enough at the time. Many improvements have been made since then, which has prompted us to revisit it. We wanted to share our learnings as we build this system and continue to improve it.

Question:

QCon: Can you give an example? When you say share your learnings, can you tell me about one of the stories, one of the lessons you’ll tell?

Answer:

Elliot: A couple things I plan to discuss are challenges we faced around joining multiple input streams and some Spark Streaming quirks. We have also done quite a bit on the operations side, including things like failure handling, calculating metrics, and monitoring. These are critical pieces that make the system resilient and production ready.

Question:

QCon: How deep do you get into the weeds with the implementation details?

Answer:

Elliot: A bit of code will probably come up in a few places about Spark Streaming specifically, but that should only be a small part of the discussion. I'll focus on different patterns we tried.

Question:

QCon: Can you give me an example of something actionable that a senior dev/lead might gain from coming to you talk?

Answer:

Elliot: They should come away with an idea about how we used the technologies we chose for building a system like this, and whether they would be suitable for their use cases. Hopefully, they will also find some of our experiences with Spark Streaming useful.

Speaker: Elliot Chow

Senior Software Engineer @Netflix

Elliot is a software engineer at Netflix on the Personalization Infrastructure team. He graduated from UC Berkeley (B.S.) and Stanford (M.S.) and has previously worked at eBay and Apple. Currently, he builds big data systems using a variety of technologies including Scala, Spark (Streaming), Kafka, and Cassandra.

Find Elliot Chow at

Speaker page

Software Engineer @Netflix

IBM Distinguished Engineer

Mark Vanderwiele

Stranger Things: The Forces that Disrupt Netflix

Senior Software Engineer, Playback Features @Netflix

Haley Tucker

99.99% Availability via Smart Real-Time Alerting

Data Science Manager @Uber

Franziska Bell

Creating A Culture of Observability at Stripe

Observability Specialist @Stripe

Cory Watson

Migrating to a Fault Tolerant System with Spanner

Software Engineer @Google

Edwin Fuquen

Freeing the Whale: How to Fail at Scale

CTO @Buoyant

Oliver Gould

Automating Chaos Experiments In Production

Senior Software Engineer @Netflix

Ali Basiri

Architecting for Failure in a Containerized World

Principle Data Analysis Leader @Infolace

Tom Faulhaber

Recommendations, Bots & Conversational Interfaces

Worked on recommender systems such as People You May Know @LinkedIn

Mitul Tiwari

Tracks

Monday Nov 7

Architectures You've Always Wondered About

You know the names. Now learn lessons from their architectures
Distributed Systems War Stories

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” - Lamport.
Containers Everywhere

State of the art in Container deployment, management, scheduling
Art of Relevancy and Recommendations

Lessons on the adoption of practical, real-world machine learning practices. AI & Deep learning explored.
Next Generation Web Standards, Frameworks, and Techniques

JavaScript, HTML5, WASM, and more... innovations targetting the browser
Optimize You

Keeping life in balance is a challenge. Learn lifehacks, tips, & techniques for success.

Tuesday Nov 8

Next Generation Microservices

What will microservices look like in 3 years? What if we could start over?
Java: Are You Ready for This?

Real world lessons & prepping for JDK9. Reactive code in Java today, Performance/Optimization, Where Unsafe is heading, & JVM compile interface.
Big Data Meets the Cloud

Overviews and lessons learned from companies that have implemented their Big Data use-cases in the Cloud
Evolving DevOps

Lessons/stories on optimizing the deployment pipeline
Software Engineering Softskills

Great engineers do more than code. Learn their secrets and level up.
Modern CS in the Real World

Applied, practical, & real-world dive into industry adoption of modern CS ideas

Wednesday Nov 9

Architecting for Failure

Your system will fail. Take control before it takes you with it.
Stream Processing

Stream Processing, Near-Real Time Processing
Bare Metal Performance

Native languages, kernel bypass, tooling - make the most of your hardware
Culture as a Differentiator

The why and how for building successful engineering cultures
//TODO: Security <-- fix this

Building security from the start. Stories, lessons, and innovations advancing the field of software security.
UX Reimagined

Bots, virtual reality, voice, and new thought processes around design. The track explores the current art of the possible in UX and lessons from early adoption.

SCHEDULE

Duration

Level:

Persona:

Key Takeaways

Abstract

Interview

Find Elliot Chow at

Similar Talks

Tracks

Monday Nov 7

Tuesday Nov 8

Wednesday Nov 9

Conference for Professional Software Developers

Follow QCon

Contact

Menu

QCons around the World

Presentation: Real Time Recommendations using Spark Streaming

Duration

Level:

Persona:

More talks on:

Key Takeaways

Abstract

Interview

Find Elliot Chow at

Similar Talks

Tracks

Monday Nov 7

Tuesday Nov 8

Wednesday Nov 9

Conference for Professional Software Developers

Follow QCon

Contact

Menu

QCons around the World