Presentation: Scaling up Near Real Time Analytics @Uber & LinkedIn

Duration: 5:25pm - 6:15pm

Abstract

Modern businesses are pushing the limits of decision making. Advances in stream processing and OLAP (Online Analytical Processing) technologies have enabled faster insights into incoming data, powering near real time decisions. Many use cases, such as fraud detection, operational dashboards, financial incentive pipelines, and experimentation (A/B testing), need SQL-like access to this streaming data.

This talk focuses on how Uber and LinkedIn use Apache Samza, Apache Calcite, and Pinot to power such use cases. The first half of the talk covers our analytics platform, AthenaX, which data scientists and engineers use to specify data transformations (SQL on streams) and make the results available for querying by real time dashboards and maps within minutes. The second half focuses on what happens under the hood and on the challenges faced with respect to scale, at-least-once semantics, windowing, schema derivation, and so on.

Speaker: Chinmay Soman

PMC Member/Committer @SamzaStream & Staff Software Engineer @Uber

Chinmay Soman is a software engineer at Uber. His areas of interest include distributed systems and security. He started out at IBM, where he worked on distributed filesystems (NFS) and replication technologies. He then joined the data infrastructure team at LinkedIn and worked on Voldemort, an open source distributed key-value store, as well as Apache Samza. He is currently the tech lead of the streaming platform team at Uber, building a self-service platform for near real time analytics. He is also a Committer and PMC member of Apache Samza.

Speaker: Yi Pan

PMC Member/Committer @SamzaStream & Distributed Systems Engineer @LinkedIn

Yi Pan has worked on distributed platforms for Internet applications for 8 years. He started at Yahoo! on Yahoo!'s NoSQL database project, leading the development of multiple features, such as real-time notification of database updates, secondary indexes, and live migration from legacy systems to the NoSQL database. He later joined and led the distributed Cloud Messaging System project, which is used heavily as a pub-sub system and transaction log for distributed databases at Yahoo!. He joined LinkedIn in 2014 and quickly became the lead of the Samza team there. He is also a Committer and PMC member of Apache Samza.
