Track:

Duration:

10:35am - 11:25am

Persona:

Data Scientist

Key Takeaways

Build a recommender system that leverages content-based approaches, collaborative filtering, and multi-armed bandits in a simple step-by-step approach.
Learn about the tradeoffs in using different techniques to make recommendations.
Hear practices, tips, and approaches to building a recommendation system.
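
As a rough illustration of the multi-armed bandit idea named above, here is a minimal epsilon-greedy sketch. It is not taken from the talk: the class name, the `simulate` helper, and the click probabilities are all hypothetical, and epsilon-greedy is just one common bandit strategy.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit over a fixed set of items (arms). Illustrative only."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # how often each arm was shown
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon,
        # otherwise exploit the arm with the best estimate so far.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental update of the mean reward for the chosen arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(click_probs, rounds=5000, epsilon=0.1, seed=0):
    """Simulate users clicking item i with the (assumed) probability click_probs[i]."""
    bandit = EpsilonGreedyBandit(len(click_probs), epsilon, seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < click_probs[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

A call such as `simulate([0.05, 0.10, 0.30])` returns a bandit whose `counts` show how often each item was recommended. The tradeoff is between exploring items with little data and exploiting the ones that already look good.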

Abstract

The age of artificial intelligence is upon us. Whether you know it or not, we interact with systems powered by machine learning on a daily basis. If you have ever wondered how social networks, online retailers, and video streaming sites seem to know exactly what content and products you desire, this session is for you.

In this talk, we will walk you through the creation of a real-world relevance and recommendation system from scratch. We will cover the machine learning theory powering such systems, focusing on useful hacks and techniques that are not typically covered in standard machine learning courses. Through this crash course on the black art of relevance and recommendation systems, you will be well on your way to using artificial intelligence to maximize your product’s user retention, engagement, and conversion rates.

Interview

Question:

QCon: What is the focus of your work today?

Answer:

Clarence: I am a Security Research Engineer at Shape Security. My role is to go through our data and come up with models to identify traffic and automated attacks on large websites and stop them. These are core competencies of our company. We gain these competencies in a variety of ways, but machine learning is becoming more and more of a focus for the company.

Question:

QCon: What’s the motivation for your talk?

Answer:

Clarence: The goal of the talk is to get developers who have a cursory understanding of machine learning to really get their hands dirty and start doing ML on their own. I think many people have taken online courses on machine learning and have a good idea of what the general concepts are. But ML is a little bit intimidating when you actually have to build something from scratch. These online courses cover high-level concepts, but they don’t cover the practical stuff that you need to know. They also skip many of the problems you may face when you actually build these systems. So these pragmatic concepts and practices are a big focus of the talk.

Question:

QCon: What are you going to go through in the talk?

Answer:

Clarence: In general, the premise is that there will be a recommendation problem to solve (this will not be a complex problem). There won’t be too many different features or dimensions to look at, because I want to focus on the problems that developers face in general (as opposed to problems specific to a particular data set). I will go through, on a high level, the different kinds of recommendation systems that are possible, from recommendations based on simple aggregates to profiles tailored to individual users on the platform. Then I will go on to content-based item recommendation systems, which generate item and user profiles, with collaborative filtering to match them up. Then I will go into feature engineering on explicit, implicit, and latent features. These are things that I will add slowly, with just a few lines of code each. With each topic that I focus on, I will add on to the recommender system that we will build throughout the discussion.

The most useful part (I think) will be some practical hacks, such as how to reliably collect data for the utility matrix used in collaborative filtering, how to extrapolate scores from the known ones if you don’t have scores for many of the items in your dataset, how to measure performance (arguably one of the most important aspects of building recommender systems), and also common things like how to get past the cold start or new entity problem when there are no users in the system yet.
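
To make the utility-matrix idea concrete, here is a minimal sketch of item-based collaborative filtering that extrapolates a missing score from known ones, plus an RMSE helper for measuring performance. It is an assumption-laden illustration, not the code from the talk: the ratings, user names, and function names are hypothetical, and cosine similarity with a weighted average is only one of several reasonable choices.

```python
from math import sqrt

# Hypothetical utility matrix: user -> {item: rating}. Missing entries are unknown.
ratings = {
    "alice": {"A": 5.0, "B": 4.0, "C": 1.0},
    "bob":   {"A": 4.0, "B": 5.0},
    "carol": {"A": 1.0, "C": 5.0},
}


def item_vector(ratings, item):
    """Column of the utility matrix: every known rating for one item."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(v, w):
    """Cosine similarity between two sparse rating vectors."""
    common = set(v) & set(w)
    if not common:
        return 0.0
    dot = sum(v[u] * w[u] for u in common)
    norm_v = sqrt(sum(x * x for x in v.values()))
    norm_w = sqrt(sum(x * x for x in w.values()))
    return dot / (norm_v * norm_w)


def predict(ratings, user, item):
    """Similarity-weighted average of the user's known ratings for other items."""
    target = item_vector(ratings, item)
    num = den = 0.0
    for other, score in ratings[user].items():
        if other == item:
            continue
        sim = cosine(target, item_vector(ratings, other))
        num += sim * score
        den += abs(sim)
    return num / den if den else None  # None: nothing to extrapolate from


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs from a held-out set."""
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))
```

For example, `predict(ratings, "bob", "C")` fills in Bob's unknown score for item C. Because it is a plain weighted average, the result always lies between Bob's existing ratings. Subtracting each user's mean rating before averaging is a common refinement.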

Question:

QCon: How would you describe the persona of the target audience of this talk?

Answer:

Clarence: The main target would be some kind of lead developer who is actively developing but also has some decision-making power over how to start a project and how to push it from the beginning to a level of maturity.

Question:

QCon: What should someone know before coming to your talk?

Answer:

Clarence: I think that those coming to the talk would be interested in some kind of recommender system and have a potential use case for it. They should know some basic terms in the space, like ‘what is a recommender system?’ I will go through terms on a high level, but it’s hard to compare all the different ways of solving a particular problem. Recommender systems are just one way of solving the general recommendation problem. So if they know in general what recommender systems are meant to do, that would be great.

Question:

QCon: QCon targets advanced architects and senior development leads, what do you feel will be actionable for that persona when they leave your talk?

Answer:

Clarence: I want them to walk away with a sense that while there are a lot of hidden problems when it comes to implementing recommender systems, these problems are actually pretty approachable if you know how to work around them. There are a lot of practical hacks around. Take the "cold start" problem, for example: not many people know a general answer to it, and there are some general best practices that are not that easy to find online.

I think they might be buried in books somewhere, especially machine learning textbooks. It is generally hard for someone who is unfamiliar with the topic to start building these systems and know where to look for patterns.

Question:

QCon: Can you give me an example of one of the ones that I might not know, like the cold start problem or something?

Answer:

Clarence: I think the cold start problem is the most general one that people either don’t have a good answer to or think depends heavily on the context of the problem. For example, say you are building a recommender system for online videos and you don’t have a good idea of which users might like a particular video. You can use user profiling to extrapolate from the knowledge that you already have, and use an external data source to perform that extrapolation on the existing users. By performing similarity matching on another dimension, one that is not based on video content or anything else to do with the items you are recommending, you are able to draw auxiliary conclusions and make comparisons on a dimension that you are not making recommendations on.
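
One way to picture the cold start workaround described above is the following sketch. It assumes a hypothetical external data source that gives each user a set of interest tags, and it matches a brand-new user to existing users on that auxiliary dimension rather than on video content. The data, the names, and the choice of Jaccard similarity are illustrative assumptions, not the speaker's implementation.

```python
def jaccard(a, b):
    """Jaccard similarity between two tag sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cold_start_recommend(new_profile, profiles, liked, k=2, n=3):
    """Recommend videos to a user with no viewing history.

    new_profile: auxiliary tags for the new user (hypothetical external data)
    profiles:    existing user -> auxiliary tags
    liked:       existing user -> videos that user liked
    """
    # Rank existing users by similarity on the auxiliary dimension only.
    neighbours = sorted(
        profiles,
        key=lambda u: jaccard(new_profile, profiles[u]),
        reverse=True,
    )[:k]

    # Score each video by the summed similarity of the neighbours who liked it.
    scores = {}
    for user in neighbours:
        weight = jaccard(new_profile, profiles[user])
        if weight == 0.0:
            continue  # ignore users with nothing in common
        for video in liked.get(user, []):
            scores[video] = scores.get(video, 0.0) + weight

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda v: (-scores[v], v))[:n]


# Hypothetical example data.
profiles = {
    "u1": {"cooking", "travel"},
    "u2": {"gaming", "tech"},
    "u3": {"travel", "photography"},
}
liked = {
    "u1": ["pasta_101", "tokyo_vlog"],
    "u2": ["gpu_review"],
    "u3": ["tokyo_vlog", "lens_guide"],
}
recommendations = cold_start_recommend(
    {"travel", "photography", "cooking"}, profiles, liked
)
```

Here the new user has watched nothing, yet still receives video suggestions because the comparison happens on profile tags, a dimension separate from the items being recommended.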

Speaker: Clarence Chio

Security Research Engineer @ShapeSecurity

Clarence Chio graduated with a B.S. and M.S. in Computer Science from Stanford, specializing in data mining and artificial intelligence. He currently works as a Security Research Engineer at Shape Security, building a product that protects high-value web assets from automated attacks. At Shape, he works on the data analysis systems used to tackle this problem. Clarence spoke on Machine Learning and Security at DEF CON 24, GeekPwn Shanghai, PHDays Moscow, BSides Las Vegas and NYC, Code Blue Tokyo, SecTor Toronto, and Hack in Paris (2015-2016). He has been a community speaker with Intel, and is also the founder and organizer of the ‘Data Mining for Cyber Security’ meetup group, the largest gathering of security data scientists in the San Francisco Bay Area.

Find Clarence Chio at

Speaker page

@cchio

Similar Talks

IBM Distinguished Engineer

Mark Vanderwiele

Stranger Things: The Forces that Disrupt Netflix

Senior Software Engineer, Playback Features @Netflix

Haley Tucker

99.99% Availability via Smart Real-Time Alerting

Data Science Manager @Uber

Franziska Bell

Creating A Culture of Observability at Stripe

Observability Specialist @Stripe

Cory Watson

Migrating to a Fault Tolerant System with Spanner

Software Engineer @Google

Edwin Fuquen

Freeing the Whale: How to Fail at Scale

CTO @Buoyant

Oliver Gould

Automating Chaos Experiments In Production

Senior Software Engineer @Netflix

Ali Basiri

Architecting for Failure in a Containerized World

Principal Data Analysis Leader @Infolace

Tom Faulhaber

Further Together: Curated Pairing Culture @Pivotal

Software Engineer @Pivotal

Neha Batra

Tracks

Monday Nov 7

Architectures You've Always Wondered About

You know the names. Now learn lessons from their architectures

Distributed Systems War Stories

“A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable.” - Lamport.

Containers Everywhere

State of the art in Container deployment, management, scheduling

Art of Relevancy and Recommendations

Lessons on the adoption of practical, real-world machine learning practices. AI & Deep learning explored.

Next Generation Web Standards, Frameworks, and Techniques

JavaScript, HTML5, WASM, and more... innovations targeting the browser

Optimize You

Keeping life in balance is a challenge. Learn lifehacks, tips, & techniques for success.

Tuesday Nov 8

Next Generation Microservices

What will microservices look like in 3 years? What if we could start over?

Java: Are You Ready for This?

Real world lessons & prepping for JDK9. Reactive code in Java today, Performance/Optimization, Where Unsafe is heading, & JVM compile interface.

Big Data Meets the Cloud

Overviews and lessons learned from companies that have implemented their Big Data use-cases in the Cloud

Evolving DevOps

Lessons/stories on optimizing the deployment pipeline

Software Engineering Softskills

Great engineers do more than code. Learn their secrets and level up.

Modern CS in the Real World

Applied, practical, & real-world dive into industry adoption of modern CS ideas

Wednesday Nov 9

Architecting for Failure

Your system will fail. Take control before it takes you with it.

Stream Processing

Stream Processing, Near-Real Time Processing

Bare Metal Performance

Native languages, kernel bypass, tooling - make the most of your hardware

Culture as a Differentiator

The why and how for building successful engineering cultures

//TODO: Security <-- fix this

Building security from the start. Stories, lessons, and innovations advancing the field of software security.

UX Reimagined

Bots, virtual reality, voice, and new thought processes around design. The track explores the current art of the possible in UX and lessons from early adoption.

Conference for Professional Software Developers

Presentation: The Art of Relevance and Recommendations
