Hien Luu
Architect at LinkedIn working with Hadoop technologies

Hien Luu is a technical lead of the Data Services Platform team at LinkedIn, where he focuses on building big data infrastructure and big data applications. He loves working with big data technologies and recently became a contributor to the Apache Pig project. He enjoys teaching and is currently an instructor for the Hadoop: Big Data Processing course at UCSC Silicon Valley Extension. He has given presentations at various conferences and user groups, including Hadoop Summit 2013, JavaOne, Silicon Valley CodeCamp, and the SVForum Software & Architecture user group.
-
Processing Big Data with Apache Pig and Apache Hive
Location: Seacliff B
Duration: Full Day

Abstract: Processing big data and building big data products at a rapid pace requires highly productive data processing technologies. Apache Pig and Apache Hive are designed to let data scientists, data analysts, and engineers be highly productive and iterate quickly when processing data at massive scale. This tutorial will not only provide hands-on experience with Apache Pig and Apache Hive, but will also give a glimpse of how they are used at LinkedIn to build big data products.

Here is what you can expect to learn in this tutorial:
- Quick high-level overview of the Hadoop data processing framework
- Understanding how Apache Pig and Apache Hive fit into the Hadoop data processing ecosystem
- Overview of the Apache Pig architecture and data flow language
- Overview of the Apache Hive architecture and query language
- Demonstration of writing and running Apache Pig scripts
- Demonstration of writing and running Apache Hive queries
- Discussion of the strengths and weaknesses of Apache Pig and Apache Hive, and when to use which
- Glimpse of upcoming query processing technologies and productivity tools like Netflix Lipstick

Target Audience:
- Architects and engineers who have an interest in big data topics
- Participants should have some high-level knowledge of Hadoop