Workshop: Apache Spark 2.x Workshop




9:00am - 12:00pm


  • Basic familiarity with Apache Spark is helpful
  • Basic programming experience in an object-oriented or functional language is required
  • The exercises will mostly be written in Scala
  • Create a free Databricks Community Edition account

Apache Spark has become one of the must-know big data technologies due to its speed, ease of use, and flexibility. With each new version, Spark provides more powerful features that make it easier to build intelligent and scalable data processing infrastructure and applications. This workshop will cover the major features in Spark 2.x and include focused, interactive hands-on exercises.

Key Takeaways:

  • Understand Spark architecture and execution model
  • Learn structured data processing with Spark SQL, DataFrames and Datasets
  • Apply powerful Spark SQL functions and user-defined functions (UDFs)
  • Perform stream processing with Spark Structured Streaming

Speaker: Hien Luu

Engineering Manager @LinkedIn focused on Big Data

Hien Luu is an engineering manager at LinkedIn and a big data enthusiast. He is particularly passionate about the intersection of Big Data and Artificial Intelligence. Teaching is one of his passions, and he currently teaches an Apache Spark course at UCSC Silicon Valley Extension. He has given presentations at various conferences, including QCon SF, QCon London, Hadoop Summit, JavaOne, ArchSummit and Lucene/Solr Revolution.
