Presentation: "Large Scale MapReduce Data Processing at Quantcast"
Time: Friday 10:35 - 11:35
Location: Franciscan I & II
Abstract: Quantcast is a leader in audience measurement and in ad targeting through lookalike models that score hundreds of millions of cookies for customer-defined segments. Quantcast ingests about 10 billion new events per day and processes hundreds of terabytes of data daily. This presentation describes the data processing architecture that allows a startup to run multiple complex production data processing systems reliably and affordably, alongside flexible systems that support rapid innovation and ad hoc analysis. Topics include cluster architecture for scalability and reliability, dedicated clusters versus cloud processing, designing MapReduce jobs for testability and correctness, dependency management, hybrid architectures that combine databases with MapReduce, and the lifecycle of effective MapReduce development.