Presentation: "Large Scale Mapreduce Data Processing at Quantcast"

Time: Friday 10:35 - 11:35

Location: Franciscan I & II

Abstract: Quantcast is a leader in audience measurement and in ad targeting, through lookalike models that score hundreds of millions of cookies for customer-defined segments. Quantcast ingests about 10 billion new events per day and processes hundreds of terabytes of data daily. This presentation describes the data processing architecture that allows a startup to reliably and affordably run multiple complex production data processing systems, along with flexible systems that support rapid innovation and ad hoc analysis. Topics include architecture of clusters for scalability and reliability, using dedicated clusters versus cloud processing, designing MapReduce jobs for testability and correctness, dependency management, hybrid architectures that incorporate databases with MapReduce, and the lifecycle of effective MapReduce development.

Ron Bodkin, Think Big Analytics and Former Xerox PARC AspectJ Committer

Ron Bodkin
Ron Bodkin is the founder of Think Big Analytics and works with Quantcast, an open ratings service for Web sites. He is also the founder of New Aspects of Software, which provides consulting and training on aspect-oriented software development and effective architectures for Java, and the leader of the open source Glassbox application performance troubleshooting project.

Previously, Ron led the first implementation projects and training efforts for customers of the AspectJ group at Xerox PARC. Prior to that, Ron was a founder and the CTO of C-bridge, a consultancy that delivered enterprise applications using Java frameworks.