You are viewing content from a past/completed QCon

Workshop: [SOLD OUT] Building Recommender Systems w/ Apache Spark 2.x

Location: Garden A

Duration: 9:00am - 4:00pm

Day of week: Friday

Level: Beginner

Prerequisites

  • Basic familiarity with Apache Spark is helpful
  • Basic programming experience in an object-oriented or functional language is required
  • The exercises will mostly be written in Scala
  • Participants should bring their laptop

Apache Spark has become one of the must-know big data technologies due to its speed, ease of use, and versatility. Spark can be used for performing data analysis and building big-data applications. Increasingly, companies are leveraging Apache Spark to build intelligent applications that use machine learning techniques. This workshop will start by covering the major features in Spark 2.x and then focus on building a recommendation system using the Spark MLlib library. It will include focused, interactive hands-on exercises.

Sign up for a free Databricks Community Edition account - https://community.cloud.databricks.com/

Tutorial materials can be found at - https://sites.google.com/view/apache-spark-workshop/

Here is what you can expect to learn from this tutorial:

  • Spark architecture and execution model
  • Structured data processing with Spark SQL, DataFrames, and Datasets
  • Stream processing with Structured Streaming
  • Major concepts and utilities in the Spark ML library for building intelligent applications
  • Building a recommender system using the Spark ML library

Speaker: Hien Luu

Engineering Manager @LinkedIn focused on Big Data

Hien Luu is an engineering manager at LinkedIn and a big data enthusiast. He is particularly passionate about the intersection between Big Data and Artificial Intelligence. Teaching is one of his passions, and he is currently teaching an Apache Spark course at the UCSC Silicon Valley Extension school. He has given presentations at various conferences, including QCon SF, QCon London, Hadoop Summit, JavaOne, ArchSummit and Lucene/Solr Revolution.
