Track: Applied Machine Learning


The world is becoming more intelligent every day. A growing number of appliances and applications collect information and send it back to the mothership to be analyzed, dissected, and fed into model generators (a.k.a. machine learning algorithms). In turn, the model generators send trained models back to those applications and appliances to modify their behavior. This is the basis of any device with “Smart” in its name. It is also the basis of any web or mobile service that recommends goods and services to you. From smart homes to smart cars, from personalized buying to job recommendations, from systems that understand speech and video to those that prevent fraud, all of these systems benefit from applied machine learning. Come to this track to learn about the technologies and practices that power these use cases.

Track Host:
Vitaly Gordon
Director of Data Science @Salesforce
Vitaly Gordon is a director of data science at Salesforce, where he and his group develop data products from the most diverse and interesting data sets in the world. Prior to Salesforce, Vitaly was a data science lead at LinkedIn, founded the data science team at LivePerson, and worked in the elite 8200 unit, leading a team of researchers in developing algorithms to fight terrorism. His contributions have been recognized through a number of awards, including the “Life Source” award, given each year to the work deemed to have the highest impact in saving lives. Vitaly holds a B.Sc. in Computer Science and an MBA from the Israeli Institute of Technology.

10:35am - 11:25am

by Leah McGuire
PhD-Level Data Scientist @Salesforce focused on using data to build products

80-90% of data science is data cleaning and feature engineering. However, if we were to count data science tools by what they are for, we would find that most innovation happens in data infrastructure and modeling. We want to change that and make data scientists much more productive while also improving the quality of their work.

In this talk I will describe the machine learning platform we wrote on top of Spark to modularize these steps. This allows easy reuse of components...

11:50am - 12:40pm

Open Space
1:40pm - 2:30pm

by Lucian Vlad Lita
Director of Data Engineering @Intuit

In the early days of personalization, the focus was on smarter, more complex machine learning models, on algorithms and optimizations. Later, the attention shifted to feature engineering as a driver for accuracy. Finally, the community focused on data as the next frontier: volume, quality, cleansing, and clean labeling. In this talk, we focus on the crucial next step in personalization: well-designed software architectures for storing, computing, and delivering responsive, accurate in-...

2:55pm - 3:45pm

by Dmitry Chechik
Discovery Team Engineer @Pinterest

The Pinterest Homefeed personalizes and ranks 1B+ pins for 100M+ users on Pinterest, using data gathered from collaborative filtering, user curation, web crawls, and many more sources. This talk will give an overview of the system and focus on effective engineering choices made to enable productive ML development. To have multiple engineers effectively develop, test, and deploy machine-learned models for the Pinterest Homefeed, we’ve built a system that allows for continuous training and feature...

4:10pm - 5:00pm

by Oscar Boykin
Data Scientist @Twitter

Today, tooling for ad-hoc data science is fairly well understood. But when you want to create a repeated process such as analytics or prediction systems, things tend to change with time, and how to deal with such change is not always clear. Columns and features are added and removed. New models are developed. Data errors are discovered and corrected. How can we build a data pipeline system to handle these demands? This talk will discuss some of the systems challenges and solutions that arise...

5:25pm - 6:15pm

by Brian Wilt
Director, Head of Data Science and Engineering @Jawbone

In Jurassic Park, scientists mined dino DNA from mosquitoes trapped in fossilized amber. But before they could use that DNA to clone a dinosaur, there was a catch: the sequences were damaged and incomplete. Ultimately, they needed frog DNA to fill in the gaps and obtain a complete sequence they could clone.

The Jawbone UP system captures health data through a battery of sensors on your wrist and an app on your phone: your movement, sleep, and heart rate. But this data is often...


Monday Nov 16

Tuesday Nov 17

Wednesday Nov 18