Workshop: Apache 2.x Workshop

Location:

Level: 
Beginner

When:

9:00am - 12:00pm

Prerequisites

  • Basic familiarity and usage with Apache Spark is helpful
  • Basic programming experience in objected-oriented or functional language is required
  • The exercises will mostly written in Scala
  • Create a free Databricks Community Edition account

Apache Spark has become one of the must-know big data technologies due to its speed, ease of use, and flexibility. With each new version, Spark provides more powerful features to make it even easier than before to build intelligent and scalable data processing infrastructure and applications. This workshop will cover the major features in Spark 2.x and include focused and interactive hands on exercises.

Key Takeaways:

  • Understand Spark architecture and execution model
  • Learn structured data processing with Spark SQL, DataFrames and Datasets
  • Apply powerful Spark SQL functions and user defined function (UDF)
  • Perform streaming processing with Spark Structured Streaming

Speaker: Hien Luu

Engineering Manager @Linkedin focused on Big Data

Hien Luu is an engineering manager at LinkedIn and he is a big data enthusiast. He is particularly passionate about the intersection between Big Data and Artificial Intelligence. Teaching is one his passions and he is currently teaching Apache Spark course at UCSC Silicon Valley Extension school. He has given presentations at various conferences like QCon SF, QCon London, Hadoop Summit, JavaOne, ArchSummit and Lucene/Solr Revolution.

Find Hien Luu at

.

Tracks

  • 21st Century Languages

    Compile to Native, Microservices, Machine learning... tailor-made languages solving modern challenges, featuring use cases around Go, Rust, C#, and Elm.

  • Architectures You've Always Wondered About

    Architectural practices from the world's most well-known properties, featuring startups, massive scale, evolving architectures, and software tools used by nearly all of us.

  • Beyond Being an Individual Contributor

    Beyond being an individual contributor. Building and Evolving managers and tech leadership.

  • DevOps: You Build It, You Run It

    Pushing DevOps beyond adoption into cultural change. Hear about designing resilience, managing alerting, CI/CD lessons, & security. Features lessons from open source, Linkedin, Netflix, Financial Times, & more. 

  • Performance Mythbusting

    Real world, applied performance proofs across stacks. Hear performance consideratiosn for .NET, Python, & Java. Learn performance use cases with OpenJ9, Instagram, and Netflix. 

  • The Practice & Frontiers of AI

    Learn about machine learning in practice and on the horizon. Learn about ML at Quora, Uber's Michelangelo, ML workflow with Netflix Meson and topics on Bots, Conversational interfaces, automation, and deployment practices in the space.

  • Going Serverless

    Learn about the state of Serverless & how to successfully leverage it! Lessons learned in the track hit on security, scalability, IoT, and offer warnings to watch out for.

  • Microservices: Patterns and Practices

    Stories of success and failure building modern Microservices, including event sourcing, reactive, decomposition, & more.

  • Evolving Java

    Java continues to evolve & change. Track covers Spring 5, async, Kotlin, serverless, the 6-month cadence plans, & AI/ML use cases.

  • The Art of Chaos Engineering

    Failure is going to happen - Are you ready? Chaos engineering is an emerging discipline - What is the state of the art?

  • Security: Attacking and Defending

    Offense and defensive security evolution that application developers should know about including SGX Enclaves, effects of AI, software exploitation techniques, & crowd defense

  • Stream Processing In The Modern Age

    Compelling applications of stream processing using Flink, Beam, Spark, Strymon & recent advances in the field, including Custom Windowing, Stateful Streaming, SQL over Streams.