Workshop: Continuous Application with Apache Spark 2.0

Location:

Level: 
Intermediate

When:

9:00am - 12:00pm

Prerequisites

All participants should bring a laptop with Chrome or Firefox installed.
You will get a free account on Databricks Community Edition, which gives you unlimited free access to a ~4 GB Spark 2.0 local mode cluster.
You should be familiar with a programming language such as Python; some knowledge of SQL and Pandas DataFrames will be beneficial.

A Continuous Application is an end-to-end application that reacts to data in real time, but it is more than a typical event-based streaming app. Continuous applications capture input streams, blend them with static/offline data, and sometimes apply machine learning to the combined data before serving the results back out. These modern applications support quick ad-hoc queries along with long-running batch queries.

In today's session, Sameer and Jules from the Evangelism team at Databricks will show you how to build a continuous application using a single API. Apache Spark 2.0 provides a high-level API that makes it easy to combine SQL, DataFrames, Streaming, Machine Learning, and Graph Processing. Through hands-on coding sessions and demo prototype code, we will show you how a small team or a single developer can build these sophisticated modern applications.

Speaker: Jules Damji

Spark Community Evangelist @Databricks

Jules S. Damji is an Apache Spark Community Evangelist with Databricks. He is a hands-on developer with over 15 years of experience and has worked at leading companies, such as Sun Microsystems, Netscape, LoudCloud/Opsware, VeriSign, and ProQuest, building large-scale distributed systems. Before joining Databricks, he was a Developer Advocate at Hortonworks.

Speaker: Sameer Farooqui

Client Services Engineer @Databricks

Sameer is a Client Services Engineer at Databricks, where he works with customers on Apache Spark deployments. He has extensive industry expertise in the Hadoop ecosystem, Cassandra, Couchbase, and the general NoSQL domain. Prior to Databricks, Sameer worked for two years as a freelance big data consultant and trainer globally and taught 100+ big data courses. Before that, Sameer was a Systems Architect at Hortonworks, an Emerging Data Platforms Consultant at Accenture R&D, and an Enterprise Consultant for Symantec/VERITAS (specializing in VCS, VVR, SF-HA).
