Presentation: Personalization in the Pinterest Homefeed
Key Takeaways
- Understand that when you build machine learning into your product, you need to build it into your infrastructure and processes as well
- Hear data processing and architecture best practices in use at Pinterest
- Learn lessons about ranking, personalization, and recommendation at extremely high scale
Abstract
The Pinterest Homefeed personalizes and ranks 1B+ pins for 100M+ users on Pinterest, using data gathered from collaborative filtering, user curation, web crawls, and other sources. This talk will give an overview of the system and focus on the effective engineering choices made to enable productive ML development. To have multiple engineers effectively develop, test, and deploy machine-learned models for the Pinterest Homefeed, we've built a system that allows for continuous training and feature gathering. We will discuss our signal-gathering framework and the single system that drives config-based featurization, offline training, and online classification. Additionally, we will discuss other engineering constraints we've built around the system to satisfy business rules and requirements.
Interview with Dmitry Chechik
QCon: Can you give me an idea of the scale you are talking about?
Dmitry: We personalize homefeeds for over 100 million users and provide recommendations over more than 1 billion unique items. If you compare that to most other recommendation domains (movies number in the tens of thousands, songs in the single-digit millions), we are several orders of magnitude bigger. So part of the scaling challenge is understanding how to build and design a system under those constraints.
QCon: What does your ML stack look like?
Dmitry: There are a number of things we built ourselves and a number of things we build on top of. In terms of our serving infrastructure, we are built on top of HBase as a storage layer, with several Java services that do ranking, recommendation, and feed building.
We have a pretty heavy offline stack, with Hadoop jobs using Cascading, Hive, and other technologies to process and build data offline. One of the main things I want to discuss is a case study of a domain-specific language we use for modeling, machine learning, and feature transformation. That language is what allows us to scale the engineering aspects of the work.
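As a rough illustration of the serving side (not Pinterest's actual code), the sketch below shows how a Java service might pull a user's precomputed signals out of HBase before ranking. The table name, column family, and qualifier (`user_signals`, `s`, `topics`) are invented for the example.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class UserSignalStore {
    // Hypothetical table and column names; the real schema is not described in the talk.
    private static final TableName TABLE = TableName.valueOf("user_signals");
    private static final byte[] FAMILY = Bytes.toBytes("s");
    private static final byte[] TOPICS = Bytes.toBytes("topics");

    private final Connection connection;

    public UserSignalStore(Configuration conf) throws Exception {
        this.connection = ConnectionFactory.createConnection(conf);
    }

    /** Fetch the serialized topic-affinity signal for one user, or null if absent. */
    public byte[] fetchTopicSignal(String userId) throws Exception {
        try (Table table = connection.getTable(TABLE)) {
            Result result = table.get(new Get(Bytes.toBytes(userId)));
            return result.getValue(FAMILY, TOPICS);
        }
    }

    public static void main(String[] args) throws Exception {
        UserSignalStore store = new UserSignalStore(HBaseConfiguration.create());
        byte[] signal = store.fetchTopicSignal("user_123");
        System.out.println(signal == null ? "no signal" : signal.length + " bytes");
    }
}
```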
QCon: Is this DSL something that is specific to Pinterest or is it available for other people to use?
Dmitry: Right now it is specific to Pinterest, but we may make it available to others. It is a pretty generic system that we can potentially use to serve many different kinds of models and to read data from many different data sources.
What I’m going to focus on with the DSL is several principles common to any system that attempts to use machine learning and classification online. Things like being able to:
- make experimentation really easy for engineers
- easily push changes online
- consume a variety of data sources
- combine all the work that goes into a model (everything from joins to feature transforms to the actual classifier and weights) into a single package (see the sketch after this list)
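The DSL itself isn't shown in the interview, but as a hedged sketch of the "single package" idea, the Java below bundles joins, feature transforms, and classifier weights into one object that can drive both training and scoring. All names here (`ModelSpec`, `transform`, `score`) are hypothetical and chosen only to mirror the principles above, not Pinterest's actual API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical sketch: a model "package" that bundles the data joins,
 * the feature transforms, and the trained weights into one deployable
 * object, so a single definition can drive offline training and online scoring.
 */
public class ModelSpec {
    private final List<String> joinedSignals; // e.g. "user_topics", "pin_topics" (illustrative)
    private final Map<String, Function<Map<String, Double>, Double>> transforms = new LinkedHashMap<>();
    private final Map<String, Double> weights = new LinkedHashMap<>();

    public ModelSpec(List<String> joinedSignals) {
        this.joinedSignals = joinedSignals;
    }

    /** Declare a derived feature as a pure function of the joined raw signals. */
    public ModelSpec transform(String name, Function<Map<String, Double>, Double> fn) {
        transforms.put(name, fn);
        return this;
    }

    /** Attach trained linear weights, keyed by feature name. */
    public ModelSpec weights(Map<String, Double> trained) {
        weights.putAll(trained);
        return this;
    }

    /** Score one (user, pin) example; the same code path runs offline and online. */
    public double score(Map<String, Double> rawSignals) {
        double sum = 0.0;
        for (Map.Entry<String, Function<Map<String, Double>, Double>> t : transforms.entrySet()) {
            sum += weights.getOrDefault(t.getKey(), 0.0) * t.getValue().apply(rawSignals);
        }
        return sum;
    }

    public List<String> requiredSignals() {
        return joinedSignals;
    }
}
```

In a setup along these lines, the transforms and the weights travel together, so shipping a newly trained model is a matter of pushing a new package rather than changing serving code, which is one way the experimentation loop stays fast.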
The focus of the talk will be using the DSL as a case study in how to take machine learning from a one-off solution to a repeatable part of your infrastructure.
QCon: What did you mean when you said, "you need to make sure you build machine learning into your infrastructure and processes"?
Dmitry: It's about answering questions like: How do you make the process of data collection easy, and something you can iterate on as your underlying data changes or as you add data types? How do you make the process of building models easy? How do you make your online and offline systems really one-to-one, and how do you make sure they work in tandem? How do you make sure the work you do offline when training models transfers online? Finally, how do you make engineers productive?
Build your systems in a way that makes online experimentation easy, makes it easy to get data quickly, and keeps your online classifier environment as similar to your offline environment as possible.
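One hedged way to picture that online/offline symmetry is to route both paths through a single featurization entry point, so logged training examples and live requests see exactly the same transforms. The `Featurizer` interface and the call sites below are illustrative assumptions, not Pinterest's implementation.

```java
import java.util.Map;

/** Hypothetical: one featurization entry point shared by training and serving. */
public interface Featurizer {
    /** Turn raw (user, pin) signals into the exact feature vector the model consumes. */
    Map<String, Double> toFeatures(Map<String, String> rawSignals);
}

/** Offline: a Hadoop job would call the same featurizer for each logged example. */
class TrainingExampleBuilder {
    private final Featurizer featurizer;
    TrainingExampleBuilder(Featurizer featurizer) { this.featurizer = featurizer; }

    String buildExample(Map<String, String> loggedSignals, boolean clicked) {
        Map<String, Double> features = featurizer.toFeatures(loggedSignals);
        return (clicked ? 1 : 0) + "\t" + features; // simplified serialization for illustration
    }
}

/** Online: the serving path calls the identical featurizer before scoring. */
class OnlineRanker {
    private final Featurizer featurizer;
    OnlineRanker(Featurizer featurizer) { this.featurizer = featurizer; }

    double score(Map<String, String> liveSignals, Map<String, Double> weights) {
        double sum = 0.0;
        for (Map.Entry<String, Double> f : featurizer.toFeatures(liveSignals).entrySet()) {
            sum += weights.getOrDefault(f.getKey(), 0.0) * f.getValue();
        }
        return sum;
    }
}
```

Because both the example builder and the ranker depend on the same `Featurizer`, any change to a transform is picked up by training and serving together, which is the one-to-one property described above.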
QCon: Who do you feel is the main type of person you are talking to in your talk?
Dmitry: I think there are two kinds of people who will benefit from the talk. The first is someone who has been very focused on the machine learning aspect and wants to figure out how to scale the system over the long term and make it something other ML folks can contribute to as well. The other is someone working on infrastructure for machine learning: someone working with a data scientist or machine learning researcher to build out infrastructure and scale a machine learning product. These are some of the pieces that wind up being essential and that everyone has to build out, and it's worth thinking about them early.