You are viewing content from a past/completed QCon

Track: Applied AI & Machine Learning

Location: Pacific LMNO

Day of week: Wednesday

Machine learning will soon have a profound effect on every industry on the planet. This revitalized wave creates huge demand, and real challenges, for software developers with this expertise. The Practical Machine Learning track will focus on how developers can successfully build real-world machine learning models based on proven techniques, using viable APIs and frameworks. Because they are critical to using machine learning in our applications, we’ll also cover best practices for collecting and preprocessing data and for choosing and building models; these are some of the biggest challenges in putting machine learning into production.

Track Host: Sid Anand

Chief Data Engineer @PayPal

Sid Anand currently serves as PayPal's Chief Data Engineer, focusing on ways to realize the value of data. Prior to joining PayPal, he held several positions including Agari's Data Architect, a Technical Lead in Search @ LinkedIn, Netflix’s Cloud Data Architect, Etsy’s VP of Engineering, and several technical roles at eBay. Sid earned his BS and MS degrees in CS from Cornell University, where he focused on Distributed Systems. In his spare time, he is a maintainer/committer on Apache Airflow, a co-chair for QCon, and a frequent speaker at conferences. When not working, Sid spends time with his wife, Shalini, and their 2 kids.

10:35am - 11:25am

Human-Centric Machine Learning Infrastructure @Netflix

Netflix has over 100 data scientists applying machine learning to a wide range of business problems from title popularity predictions to quality of streaming optimizations. Our unique culture gives data scientists plenty of freedom to choose the modeling approach, libraries, and even the programming language that will make them productive at solving problems. However, we want to balance this freedom by providing a solid infrastructure for machine learning, ensuring models can be promoted quickly and reliably from prototype to production, and enabling reproducible and easily shareable results.

We started building this infrastructure a little over a year ago with a human-centric mindset. Many existing open-source machine learning frameworks are great at making advanced modeling possible. The job of our ML infrastructure is to make it remarkably easy to apply these frameworks to real business problems at Netflix. We have found that this requires an infrastructure that covers the day-to-day challenges of data scientists holistically, from understanding input data to building trust with consumers of models, not just the parts that are directly related to fitting and scoring models.

Come learn the techniques and underlying principles driving our approach, which you'll be able to adapt and apply to your own use cases.

Ville Tuulos, Machine Learning Infrastructure Engineer @Netflix

11:50am - 12:40pm

Deep Representation: Building a Semantic Image Search Engine

Many problems combine Natural Language Processing and Computer Vision. Drawing on his experience leading over a hundred applied AI projects at Insight, Emmanuel will give a step-by-step tutorial on how to build a semantic search engine for text and images, with code included! The approaches presented extend naturally to other applications such as image and video captioning, reading text from videos, selecting optimal thumbnails, and generating code from sketches of websites (all projects that were tackled at Insight), and more!

Emmanuel Ameisen, Head of AI @InsightDataSci
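
As a rough illustration of the idea behind the semantic search described in the abstract above, items can be represented as vectors and ranked by similarity to a query vector. The sketch below is not taken from the talk: the three-dimensional vectors, the file names, and the `search` helper are hypothetical stand-ins for embeddings that a trained text or image model would normally produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k indexed items most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical toy embeddings; a real system would compute these with a model.
TOY_INDEX = {
    "cat_photo.jpg": [0.9, 0.1, 0.0],
    "dog_photo.jpg": [0.7, 0.3, 0.1],
    "car_photo.jpg": [0.0, 0.1, 0.9],
}

# Hypothetical query vector, e.g. the embedding of a text query.
results = search([1.0, 0.0, 0.0], TOY_INDEX)
print(results)
```

Because text and images share one vector space in this kind of design, the same ranking function can serve text-to-image and image-to-image queries; at larger scale an approximate nearest-neighbor index would usually replace the linear scan shown here.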

1:40pm - 2:30pm

Nearline Recommendations for Active Communities @LinkedIn

At LinkedIn, our mission is to use AI to connect every member of the global workforce to make them more productive and successful. The social network is the backbone for professionals to engage with each other at every stage of their career. In the first half of this talk, I will focus on technologies we have built to power LinkedIn’s “People You May Know” product, the primary driver to connect the world’s professionals to each other to form a basic community. Our platform allows for triangle closing and other graph walk algorithms in real time. It also allows models to consider near real-time features based on a user’s context. We will demonstrate improvements through A/B tests. We will then move on to discuss work done in predicting the downstream impact of forming an edge between two members on the overall activity of our ecosystem. We will show that the way a member’s network evolves plays an important role in their downstream engagement. Finally, we will present our work on near real-time optimization of activity-based notifications that ensure that our members never miss a conversation that matters. We will describe our nearline platform for notification recommendation and show through experiments that delivering the right information to the right user (through better content targeting) at the right time (through delivery time optimization and message spacing) is critical to building an actively engaged community.

Hema Raghavan, Senior Manager & Heading AI for Growth and Communication Relevance @LinkedIn
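
The "triangle closing" mentioned in the abstract above can be pictured as suggesting members who share connections with you but are not yet connected to you. The following is a minimal sketch under that assumption, not LinkedIn's implementation: the toy graph, the member names, and the `triangle_closing_candidates` function are hypothetical, and ranking by the number of common connections is only one simple scoring choice.

```python
from collections import Counter


def triangle_closing_candidates(graph, member, top_k=3):
    """Rank non-connections of `member` by how many connections they share.

    `graph` maps each member to the set of members they are connected to.
    Each shared connection is an open triangle that a new edge would close.
    """
    direct = graph.get(member, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the member themselves and anyone already connected.
            if candidate != member and candidate not in direct:
                counts[candidate] += 1
    # Most common connections first; ties broken alphabetically for stability.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Hypothetical toy network (undirected, stored as adjacency sets).
TOY_GRAPH = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "dee", "eli"},
    "cy": {"ana", "dee"},
    "dee": {"bo", "cy"},
    "eli": {"bo"},
}

suggestions = triangle_closing_candidates(TOY_GRAPH, "ana")
print(suggestions)
```

A production system would presumably combine such graph-derived counts with other features in a learned model; the count alone is used here only to keep the example small.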

2:55pm - 3:45pm

Fairness, Transparency, and Privacy in AI @LinkedIn

How do we protect the privacy of users in large-scale systems? How do we ensure fairness and transparency when developing machine-learned models? With the ongoing explosive growth of AI/ML models and systems, these are some of the ethical and legal challenges encountered by researchers and practitioners alike. In this talk, we will first present an overview of privacy breaches and algorithmic bias/discrimination issues observed in the Internet industry over the last few years, along with the lessons learned, key regulations and laws, and the evolution of techniques for achieving privacy and fairness in data-driven systems. We will motivate the need for adopting a "privacy and fairness by design" approach when developing data-driven AI/ML models and systems for different consumer and enterprise applications. We will also focus on the application of privacy-preserving data mining and fairness-aware machine learning techniques in practice, by presenting case studies spanning different LinkedIn applications, and conclude with the key takeaways and open challenges.

Krishnaram Kenthapadi, Tech Lead Fairness, Transparency, Explainability & Privacy Efforts @LinkedIn

4:10pm - 5:00pm

Jupyter Notebooks: Interactive Visualization Approaches

Jupyter Notebooks are becoming the IDE of choice for data scientists and researchers. They provide users with an exploratory environment where they can quickly research and prototype different models and visualize the results, all in one place. Notebooks are easy to share and can be converted into documents or slides to present to stakeholders.

With widget libraries like ipywidgets and bqplot, users can create rich applications, dashboards and tools using just Python code.

In this talk, we will see how we can build interactive visualizations in the Jupyter notebook. In the first part of the talk, I'll introduce the widget libraries and walk you through the code of a simple example so we understand how to assemble and link these widgets. Then we'll look at use cases including building dashboards from server logs, Twitter sentiment analysis and, finally, tools for building, training and diagnosing deep learning models.

Chakri Cherukuri, Senior Researcher in the Quantitative Financial Research Group @Bloomberg


  • Modern Operating Systems

    Applied, practical & real-world deep-dive into industry adoption of OS, containers and virtualization, including Linux on.

  • Software Supply Chain

    Securing the container image supply chain (containers + orchestration + security + DevOps).

  • Modern CS in the Real World

    Thoughts pushing software forward, including consensus, CRDTs, formal methods & probabilistic programming.

  • Tech Ethics: The Intersection of Human Welfare & STEM

    What does it mean to be ethical in software? Hear how the discussion is evolving and what is being said in ethics.

  • Optimizing Yourself: Human Skills for Individuals

    Better teams start with a better self. Learn practical skills for IC.

  • Modern Data Architectures

    Today’s systems move huge volumes of data. Hear how places like LinkedIn, Facebook, Uber and more built their systems and learn from their mistakes.

  • Practices of DevOps & Lean Thinking

    Practical approaches using DevOps and a lean approach to delivering software.

  • Operationalizing Microservices: Design, Deliver, Operate

    What's the last mile for deploying your service? Learn techniques from the world's most innovative shops on managing and operating Microservices at scale.

  • Bare Knuckle Performance

    Killing latency and getting the most out of your hardware.

  • Architectures You've Always Wondered About

    Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, & more.

  • Machine Learning for Developers

    AI/ML is more approachable than ever. Discover how deep learning and ML are being used in practice. Topics include: TensorFlow, TPUs, Keras, PyTorch & more. No PhD required.

  • Production Readiness: Building Resilient Systems

    Making systems resilient involves people and tech. Learn about strategies being used from chaos testing to distributed systems clustering.

  • Surviving Uncertainty: Regulation, Risk, and Compliance

    With so much uncertainty, how do you bulkhead your organization and technology choices? Learn strategies for dealing with uncertainty.

  • Languages of Infra

    This track explores languages being used to code the infrastructure. Expect practices on toolkits and languages like CloudFormation, Terraform, Python, Go, Rust, Erlang.

  • Building & Scaling High-Performing Teams

    Building, maintaining, and growing a team balanced for skills and aptitudes. Constraint theory, systems thinking, lean, hiring/firing and performance improvement.

  • Evolving the JVM

    The JVM continues to evolve. We’ll look at how languages like Kotlin, Graal, Clojure, and Java are evolving the JDK. Expect polyglot, multi-VM, performance, and more in this track.

  • Trust, Safety & Security

    Privacy, confidentiality, safety and security: learning from the frontlines.

  • JavaScript & Transpiler/WebAssembly Track

    JavaScript is the language of the web. Latest practices for JavaScript development in and how transpilers are affecting the way we work. We’ll also look at the work being done with WebAssembly.