You are viewing content from a past/completed QCon

Track: Emerging Trends in Data Engineering

Location: Bayview AB

Day of week: Tuesday

Data engineering is becoming increasingly relevant to our highly connected, AI-driven world. In the past, software engineers focused their efforts on developing scalable web architectures until they realized that their biggest headache was their data architecture. For most of us, data architecture simply meant running an RDBMS for all of our needs, from transactional read-write workloads to ad hoc point and scan analytics workloads. As our data grew, so did our use cases for data-driven products (e.g. fraud detection systems, recommender systems, personalization services). These two rising trends combined to stress our RDBMSs beyond their capabilities. Data engineers entered the field to solve these problems by introducing specialized data stores and engines (e.g. search engines, graph engines, NoSQL stores, large-scale data processing such as Spark, and stream processing such as Beam, Flink, and Spark) and the machinery to glue them together (e.g. ETL pipelines, Kafka, Sqoop, Flume). Today, data architectures are as vast and varied as the use cases they support. What are some emerging technologies and trends in this space, and how are some cutting-edge companies solving their problems? Come to this track to learn more.

Track Host: Sid Anand

Chief Data Engineer @PayPal

Sid Anand currently serves as PayPal's Chief Data Engineer, focusing on ways to realize the value of data. Prior to joining PayPal, he held several positions including Agari's Data Architect, a Technical Lead in Search @ LinkedIn, Netflix’s Cloud Data Architect, Etsy’s VP of Engineering, and several technical roles at eBay. Sid earned his BS and MS degrees in CS from Cornell University, where he focused on Distributed Systems. In his spare time, he is a maintainer/committer on Apache Airflow, a co-chair for QCon, and a frequent speaker at conferences. When not working, Sid spends time with his wife, Shalini, and their 2 kids.

10:35am - 11:25am

Data Engineering Open Space

11:50am - 12:40pm

Massively scaling MySQL using Vitess

Sugu Sougoumarane, Co-Founder / CTO @planetscaledata & Co-Creator @vitessio

1:40pm - 2:30pm

Transaction Processing in FoundationDB

Evan Tschannen, Lead Developer/Committer FoundationDB

2:55pm - 3:45pm

Patterns of Streaming Applications

Monal Daxini, Distributed Systems Engineer / Leader @Netflix

4:10pm - 5:00pm

Training Deep Learning Models at Scale on Kubernetes

Deepak Bobbarjung, Founding Engineer @PassageAI
Mitul Tiwari, CTO @PassageAI

5:25pm - 6:15pm

The Whys and Hows of Database Streaming

Batch-style ETL pipelines have been the de facto method for getting data from OLTP to OLAP database systems for a long time. At WePay, when we first built our data pipeline from MySQL to BigQuery, we adopted this tried-and-true approach. However, as our company scaled and our business needs grew, we observed a stronger demand for making data available for analytics in real time. This led us to redesign our pipeline around a streaming-based approach using open-source technologies such as Debezium and Kafka.
This talk covers change data capture (CDC), the central design pattern behind database streaming, and its advantages over alternative approaches such as triggers or event sourcing. To solidify the concept, we will go through our MySQL-to-BigQuery streaming pipeline in detail, explaining the core components involved and how we built the pipeline to be resilient to failure. Finally, we will expand on our ongoing work on the additional challenges of streaming from peer-to-peer distributed databases (e.g. Cassandra) and on some potential solutions.
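
As a rough illustration of the CDC idea described above (not WePay's actual pipeline), the sketch below replays a list of simplified change events into an in-memory copy of a table. The event shape, with `op`, `before`, and `after` fields and the op codes `c`, `r`, `u`, and `d`, loosely follows the Debezium change event envelope; real events carry additional schema and source metadata. The function names, the `id` key, and the sample rows are hypothetical.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_change(table: Dict[Any, Row], event: Dict[str, Any], key: str = "id") -> None:
    """Apply one simplified change event to an in-memory replica.

    Assumed op codes: "c" (create), "r" (snapshot read), "u" (update),
    "d" (delete). Writes are upserts keyed on the primary key, so
    redelivering the same event leaves the replica unchanged.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        row = event["after"]
        table[row[key]] = row
    elif op == "d":
        row = event["before"]
        table.pop(row[key], None)  # deleting a missing row is a no-op
    else:
        raise ValueError(f"unknown op code: {op!r}")


def replay(events: Iterable[Dict[str, Any]], key: str = "id") -> Dict[Any, Row]:
    """Build a replica from scratch by applying events in log order."""
    table: Dict[Any, Row] = {}
    for event in events:
        apply_change(table, event, key)
    return table


# Hypothetical change log for a "payments" table.
EVENTS = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "pending"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "pending"}},
    {"op": "u", "before": {"id": 1, "status": "pending"},
     "after": {"id": 1, "status": "paid"}},
    {"op": "d", "before": {"id": 2, "status": "pending"}, "after": None},
]

if __name__ == "__main__":
    print(replay(EVENTS))
```

Because each event is applied as an upsert or a tolerant delete, replaying the whole log after a failure converges to the same state, which is one common way to make such a consumer resilient to redelivery.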

Joy Gao, Sr. Software Engineer @WePay

Proposed Tracks

  • Architectures You've Always Wondered About

    Next-gen architectures from the most admired companies in software, such as Netflix, Google, Facebook, Twitter, & more

  • Machine Learning without a PhD

    AI/ML is more approachable than ever. Discover how deep learning and ML are being used in practice. Topics include: TensorFlow, TPUs, Keras, PyTorch & more. No PhD required.

  • Production Readiness: Building Resilient Systems

    Making systems resilient involves people and tech. Learn about strategies being used from chaos testing to distributed systems clustering.

  • Building Predictive Data Pipelines

    From personalized news feeds to engaging experiences that forecast demand: learn how innovators are building predictive systems in modern application development.

  • Modern Languages: The Right Language for the Job

    We're polyglot developers. Learn languages that excel at very specific tasks and remove undifferentiated heavy lifting at the language level.

  • Delivering on the Promise of Containers

    Runtime containers, libraries and services that power microservices.

  • Evolving Java & the JVM

    6-month cadence, cloud-native deployments, scale, Graal, Kotlin, and beyond. Learn how the role of Java and the JVM is evolving.

  • Trust, Safety & Security

    Privacy, confidentiality, safety and security: learning from the frontlines.

  • Beyond the Web: What’s Next for JavaScript

    JavaScript is the language of the web. Learn the latest practices for JavaScript development in and out of the browser. Topics include: React, serverless, npm, performance, & less traditional interfaces.

  • Modern Operating Systems

    Applied, practical & real-world deep-dive into industry adoption of OS, containers and virtualization, including Linux on.

  • Optimizing You: Human Skills for Individuals

    Better teams start with a better self. Learn practical skills for individual contributors (ICs).

  • Modern CS in the Real World

    Thoughts pushing software forward, including consensus, CRDTs, formal methods & probabilistic programming.

  • Human Systems: Hacking the Org

    The power of leadership, engineering metrics, and strategies for shaping the org for velocity.

  • Building High-Performing Teams

    Building, maintaining, and growing a team balanced for skills and aptitudes. Constraint theory, systems thinking, lean, hiring/firing, and performance improvement.

  • Software Defined Infrastructure: Kubernetes, Service Meshes & Beyond

    Deploying, scaling and managing your services is undifferentiated heavy lifting. Hear stories, learn techniques and dive deep into what it means to code your infrastructure.

  • Practices of DevOps & Lean Thinking

    Practical approaches using DevOps and a lean approach to delivering software.

  • Operationalizing Microservices: Design, Deliver, Operate

    What's the last mile for deploying your service? Learn techniques from the world's most innovative shops on managing and operating Microservices at scale.

  • Developer Experience: Level up your Engineering Effectiveness

    Improving the end-to-end developer experience: design, dev, test, deploy, and operate/understand.
