
Scaling AncestryDNA using Hadoop and HBase


Grand Ballroom A
Monday, 2:50pm - 3:40pm

What do you get when you take Bioinformatics Scientists with PhDs and mix them up with Software Engineers? Why, AncestryDNA on Hadoop and HBase! Get the whole story from both the management (Bill Yetman, Sr. Director) and developer (Jeremy Pollack, Principal Engineer/Team Lead) points of view. Find out how this unique cast of characters took academic programs and created an industrial, scalable DNA processing pipeline (a real Big Data problem) using Hadoop and HBase. The final implementation provided a 1700% performance improvement.


You don’t know how DNA matching works? No worries. We’ll provide a simple example so you can follow along. The numbers: a full autosomal test, 700,000 SNPs used for ethnicity and matching, a DNA pool size of 120,000 samples, and over 6 million 4th cousin matches already delivered to our users. Learn how Agile techniques (start simple, get going, iterate), the “measure everything” principle, and a mix of scientists and engineers came together to create a breakthrough architecture – and a unique Family History Product along the way.

Jeremy is a senior engineer at Ancestry.com, where he supports a team of scientists and makes their discoveries scale. In the past, he’s written code that withstood the traffic from a Super Bowl ad, created the content management system for one of the web’s most popular parenting sites, and overseen the technology needs of a well-known online magazine. When he’s not coding, he enjoys reading, playing the darbuka, and throwing awesome dinner parties.
Bill Yetman has served as Senior Director of Engineering at Ancestry.com since January 2011. Bill has held multiple positions with Ancestry.com since August 2002, including Senior Director of Engineering, Director of Sites, Mobile and APIs, Director of Ad Operations and Ad Sales, Senior Software Manager of eCommerce and Senior Software Developer. Prior to joining Ancestry.com, he held several developer and programmer roles with Coresoft Technologies, Inc., Novell/WordPerfect, Fujitsu Systems of America and NCR. Mr. Yetman holds a B.S. in Computer Science and a B.A. in Psychology from San Diego State University.