Workshop: Building Smarter Applications with Spark & H2O

Location:

Level: Intermediate
9:00am - 4:00pm

Data is today’s clay and the newest killer applications are data products.

In this workshop, you will learn about the anatomy of a data product on Spark and H2O. Then we’ll show you how to mold your data into a pipeline and a real-time ML system by building a complete loan interest rate prediction product. Our clay is going to be a public Lending Club dataset, and we will create a real-time Spark Streaming application that both makes predictions and deploys library models.

The goal is to produce borrower interest rates that are comparable to or better than human-led predictions.

Key Takeaways:

  • Clean and transform datasets in Sparkling Water
  • Join varying datasets (text and time series) by defining conformed dimensions
  • Use MLlib’s word2vec for NLP and H2O’s Gradient Boosting Machine to produce a scoring engine
  • Integrate the scoring engine from Sparkling Water models into Spark Streaming
  • Produce real-time scoring and predictions
  • Create a pipeline of ensemble models – retire and promote them based on risk and domain
  • Deploy smarter applications on Spark and Cloud

Monday Nov 16

Tuesday Nov 17

Wednesday Nov 18