Machine learning will soon have a profound effect on every industry on the planet. This renewed wave of interest creates huge demand, and real challenges, for software developers with this expertise. The Practical Machine Learning track will focus on how developers can successfully build real-world machine learning models using proven techniques and viable APIs and frameworks. Since it's critical to using machine learning in our applications, we’ll also cover best practices for collecting and preprocessing data and for choosing and building models; these are some of the biggest challenges in putting machine learning into production.
Track: Applied AI & Machine Learning
Location: Pacific LMNO
Day of week: Wednesday
Track Host: Sid Anand
Sid Anand currently serves as PayPal's Chief Data Engineer, focusing on ways to realize the value of data. Prior to joining PayPal, he held several positions including Agari's Data Architect, a Technical Lead in Search @ LinkedIn, Netflix’s Cloud Data Architect, Etsy’s VP of Engineering, and several technical roles at eBay. Sid earned his BS and MS degrees in CS from Cornell University, where he focused on Distributed Systems. In his spare time, he is a maintainer/committer on Apache Airflow, a co-chair for QCon, and a frequent speaker at conferences. When not working, Sid spends time with his wife, Shalini, and their 2 kids.
10:35am - 11:25am
Human-Centric Machine Learning Infrastructure @Netflix
Netflix has over 100 data scientists applying machine learning to a wide range of business problems from title popularity predictions to quality of streaming optimizations. Our unique culture gives data scientists plenty of freedom to choose the modeling approach, libraries, and even the programming language that will make them productive at solving problems. However, we want to balance this freedom by providing a solid infrastructure for machine learning, ensuring models can be promoted quickly and reliably from prototype to production, and enabling reproducible and easily shareable results.
We started building this infrastructure a little over a year ago with a human-centric mindset. Many existing open-source machine learning frameworks are great at making advanced modeling possible. The job of our ML infrastructure is to make it remarkably easy to apply these frameworks to real business problems at Netflix. We have found that this requires an infrastructure that covers the day-to-day challenges of data scientists holistically, from understanding input data to building trust with consumers of models, not just the parts that are directly related to fitting and scoring models.
Come learn the techniques and underlying principles driving our approach, which you'll be able to adapt and apply to your own use cases.
11:50am - 12:40pm
Deep Representation: Building a Semantic Image Search Engine
Many problems combine Natural Language Processing and Computer Vision. Drawing on his experience leading over a hundred applied AI projects at Insight, Emmanuel will give a step-by-step tutorial on how to build a semantic search engine for text and images, with code included! The approaches presented extend naturally to other applications such as image and video captioning, reading text from videos, selecting optimal thumbnails, and generating code from sketches of websites (all projects that were tackled at Insight), and more!
1:40pm - 2:30pm
Nearline Recommendations for Active Communities @LinkedIn
At LinkedIn, our mission is to use AI to connect every member of the global workforce to make them more productive and successful. The social network is the backbone for professionals to engage with each other at every stage of their career. In the first half of this talk, I will focus on technologies we have built to power LinkedIn’s “People You May Know” product, the primary driver for connecting the world’s professionals to each other to form a basic community. Our platform allows for triangle closing and other graph walk algorithms in real time. It also allows models to consider near real-time features based on a user’s context. We will demonstrate improvements through A/B tests. We will then move on to discuss work on predicting the downstream impact that forming an edge between two members has on the overall activity of our ecosystem. We will show that how a member’s network evolves plays an important role in their downstream engagement. Finally, we will present our work on near real-time optimization of activity-based notifications, which ensures that our members never miss a conversation that matters. We will describe our nearline platform for notification recommendation and show through experiments that delivering the right information to the right user (through better content targeting) at the right time (through delivery time optimization and message spacing) is critical to building an actively engaged community.
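For readers unfamiliar with the term, triangle closing can be illustrated with a minimal sketch: suggest members who share connections with you but are not yet connected to you, ranked by the number of mutual connections. This is an illustration only, not LinkedIn's implementation; the function name `triangle_closing_candidates`, the toy graph, and the ranking rule are all assumptions made for the example.

```python
from collections import Counter


def triangle_closing_candidates(graph, member, top_k=3):
    """Rank non-connections of `member` by their number of mutual connections.

    `graph` is a hypothetical adjacency map: member -> set of direct connections.
    """
    direct = graph.get(member, set())
    mutual_counts = Counter()
    for connection in direct:
        # Walk two hops out: each friend-of-friend would "close a triangle".
        for candidate in graph.get(connection, set()):
            if candidate != member and candidate not in direct:
                mutual_counts[candidate] += 1
    # Sort by mutual-connection count (descending), then by name for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Toy undirected graph (made-up data for illustration).
graph = {
    "ana": {"bo", "cy"},
    "bo": {"ana", "dee", "eli"},
    "cy": {"ana", "dee"},
    "dee": {"bo", "cy"},
    "eli": {"bo"},
}

suggestions = triangle_closing_candidates(graph, "ana")
print(suggestions)  # [('dee', 2), ('eli', 1)]
```

Here "dee" ranks first because two of ana's connections already know her. A production system would presumably combine such graph-walk signals with learned models and near real-time features, as the abstract describes.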
2:55pm - 3:45pm
Fairness, Transparency, and Privacy in AI @LinkedIn
How do we protect the privacy of users in large-scale systems? How do we ensure fairness and transparency when developing machine-learned models? With the ongoing explosive growth of AI/ML models and systems, these are some of the ethical and legal challenges encountered by researchers and practitioners alike. In this talk, we will first present an overview of privacy breaches and algorithmic bias/discrimination issues observed in the Internet industry over the last few years, along with the lessons learned, key regulations and laws, and the evolution of techniques for achieving privacy and fairness in data-driven systems. We will motivate the need for adopting a "privacy and fairness by design" approach when developing data-driven AI/ML models and systems for different consumer and enterprise applications. We will also focus on the application of privacy-preserving data mining and fairness-aware machine learning techniques in practice, by presenting case studies spanning different LinkedIn applications, and conclude with the key takeaways and open challenges.
4:10pm - 5:00pm
Jupyter Notebooks: Interactive Visualization Approaches
Jupyter Notebooks are becoming the IDE of choice for data scientists and researchers. They provide the users with a nice exploratory environment where they can quickly research and prototype different models and visualize the results all in one place. Notebooks are easy to share and can be converted into documents/slides to present to stakeholders.
With widget libraries like ipywidgets and bqplot, users can create rich applications, dashboards, and tools using just Python code.
In this talk, we will see how we can build interactive visualizations in the Jupyter notebook. In the first part of the talk, I'll introduce the widget libraries and walk you through the code of a simple example so we understand how to assemble and link these widgets. Then we'll look at use cases including building dashboards from server logs, Twitter sentiment analysis, and finally tools for building, training, and diagnosing deep learning models.
Tracks
Monday, 5 November
-
Microservices / Serverless Patterns & Practices
Evolving, observing, persisting, and building modern microservices
-
Practices of DevOps & Lean Thinking
Practical approaches using DevOps & Lean Thinking
-
JavaScript & Web Tech
Beyond JavaScript in the Browser. Exploring WebAssembly, Electron, & Modern Frameworks
-
Modern CS in the Real World
Thoughts pushing software forward, including consensus, CRDTs, formal methods, & probabilistic programming
-
Modern Operating Systems
Applied, practical, & real-world deep-dive into industry adoption of OS, containers and virtualization, including Linux on Windows, LinuxKit, and Unikernels
-
Optimizing You: Human Skills for Individuals
Better teams start with a better self. Learn practical skills for IC
Tuesday, 6 November
-
Architectures You've Always Wondered About
Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, & more
-
21st Century Languages
Lessons learned from languages like Rust, Go-lang, Swift, Kotlin, and more.
-
Emerging Trends in Data Engineering
Showcasing DataEng tech and highlighting the strengths of each in real-world applications.
-
Bare Knuckle Performance
Killing latency and getting the most out of your hardware
-
Socially Conscious Software
Building socially responsible software that protects users' privacy & safety
-
Delivering on the Promise of Containers
Runtime containers, libraries, and services that power microservices
Wednesday, 7 November
-
Applied AI & Machine Learning
Applied machine learning lessons for SWEs, including tech around TensorFlow, TPUs, Keras, PyTorch, & more
-
Production Readiness: Building Resilient Systems
More than just building software, building deployable, production-ready software
-
Developer Experience: Level up your Engineering Effectiveness
Improving the end-to-end developer experience - design, dev, test, deploy, operate/understand.
-
Security: Lessons Attacking & Defending
Security from the defender's AND the attacker's point of view
-
Future of Human Computer Interaction
IoT, voice, mobile: Interfaces pushing the boundary of what we consider to be the interface
-
Enterprise Languages
Workhorse languages found in modern enterprises. Expect Java, .NET, & Node in this track