Presentation: "Amazon and Hadoop"
Track: Architecture Quality (day 1)
Time: Wednesday 17:15 - 18:15
Location: Metropolitan I
Abstract: Hadoop, one implementation of the MapReduce programming model, is a framework for computing over large datasets. It splits the dataset into manageable chunks, spreads them across a fleet of machines, launches and manages jobs that process the data wherever it is physically located, and finally aggregates the job output into a single result.
Amazon Web Services' Alexa Web Search Service uses Hadoop in production. Developers can now run a small regular expression over millions of documents crawled by Alexa and filter the search results in a cost-effective manner.
In this talk, Jinesh Varia, Evangelist for Amazon Web Services, will share some of the architectural insights from this use case, the problems we faced, and the problems it solved.
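
For readers new to the idea, the split, process, and aggregate flow described in the abstract can be sketched in a few lines of Python. This is only an illustrative, single-machine sketch and not the Hadoop API or the Alexa implementation: the function names (split_into_chunks, map_chunk, reduce_results, run_job), the (doc_id, text) document shape, and the chunk size are all assumptions made for the example.

```python
import re
from collections import defaultdict


def split_into_chunks(documents, chunk_size):
    """Split the dataset into manageable chunks of at most chunk_size items."""
    return [documents[i:i + chunk_size]
            for i in range(0, len(documents), chunk_size)]


def map_chunk(chunk, pattern):
    """Map step: emit (doc_id, match_count) for each document matching the regex."""
    regex = re.compile(pattern)
    return [(doc_id, len(regex.findall(text)))
            for doc_id, text in chunk
            if regex.search(text)]


def reduce_results(mapped_outputs):
    """Reduce step: aggregate the per-chunk outputs into one final result."""
    totals = defaultdict(int)
    for output in mapped_outputs:
        for doc_id, count in output:
            totals[doc_id] += count
    return dict(totals)


def run_job(documents, pattern, chunk_size=2):
    """Run the whole job locally; a real cluster would process chunks on separate machines."""
    chunks = split_into_chunks(documents, chunk_size)
    mapped = [map_chunk(chunk, pattern) for chunk in chunks]
    return reduce_results(mapped)


if __name__ == "__main__":
    # Hypothetical sample documents standing in for crawled pages.
    docs = [
        ("a", "hadoop and hadoop"),
        ("b", "nothing relevant"),
        ("c", "Hadoop in capitals"),
        ("d", "map hadoop reduce"),
    ]
    print(run_job(docs, r"hadoop"))
```

In this sketch the chunks are processed one after another in a single process; the point of a framework like Hadoop is that the same map and reduce steps can be scheduled across many machines, close to where each chunk is stored.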