Track: Architectures You've Always Wondered About

Location: Ballroom A

Day of week: Monday

How do they do it? In QCon's marquee Architectures track, we learn what it takes to operate at large scale from well-known names in our industry. You will take away hard-earned architectural lessons on scalability, reliability, throughput, and performance.

Track Host: Randy Shoup

VP Engineering @WeWork, Previously @StitchFix @Google & @Ebay

Randy is a 30-year veteran of Silicon Valley, and has worked as a senior technology leader and executive at companies ranging from small startups, to mid-sized places, to eBay and Google. Randy is currently VP Engineering at WeWork in San Francisco. He is particularly passionate about the nexus of culture, technology, and organization.

10:35am - 11:25am

Scaling Patterns for Netflix's Edge

In 2008 Netflix had fewer than a million streaming members. Today we have over 150 million. That explosive growth in membership has led to similar growth in the number of microservices, in the amount of cloud resources, and in our overall architectural complexity. Eventually, that sheer number of computation resources becomes hard to manage and compromises our reliability. At Netflix, we’ve found a few techniques that have helped keep our computation growth manageable and reliable.

There are the obvious tasks of performance tuning, reducing features, or reducing data. Going beyond just “tightening the belt” tactics, we had to rethink how we handle every request. At our scale, we can no longer call a customer database on every request, we can no longer fan out to a cascade of mid-tier requests on every request, and we can no longer log every request, so we don’t. This session will introduce the architectural patterns we’ve adopted to skip those steps, which would normally be considered required for a functioning system.

I will also be sharing successes we’ve had from unintuitively partitioning computation into multiple services to get better runtime characteristics. Through this session, you will be introduced to useful probabilistic data structures, innovative bi-directional data passing, and open-source projects available from Netflix that make this all possible.

Justin Ryan, Playback Edge Engineering @Netflix

11:50am - 12:40pm

Secrets at Planet-Scale: Engineering the Internal Google KMS

This talk discusses Google’s internal key management system (KMS) for cryptographic key material, which is a critical part of Google's overall strategy for user data protection. The talk will cover the design choices and strategies that Google adopted in order to build a highly reliable, highly scalable service. It will close with ongoing maintenance pain points and suggested practices for your own internal key management service.

This internal KMS underlies most storage, authentication, cross-site request forgery protection, and other critical security systems at Google, and hence needs to have very high availability. Furthermore, Google’s internal KMS not only manages the generation, distribution and rotation of cryptographic keys, but also manages other secret data. Google’s internal KMS serves a massive volume of queries, more per second than Gmail or any other single Google service, and needs to be very reliable in order to do so, historically performing at more than 99.9999% availability.

The design choices that favored high availability have caused a few pain points for our clients. An example is the delay introduced between clients updating their keys/configs and the changes being reflected in production. For many of the system’s clients this delay is too long. We’ll discuss this and other pain points, and how we’re improving the user experience.

Anvita Pandit, Software Developer @Google

1:40pm - 2:30pm

Architectures That Scale Deep - Regaining Control in Deep Systems

We often hear about architectural "scale" as if it's one-dimensional and linear. In fact, it is neither, and that's breaking our tools and processes. Where modern, microservice-based architectures are concerned, "large-scale systems" aren't simply larger versions of "small-scale" systems – they are something completely different. Enter the "Deep System."

In this talk, we first develop a shared intuition and formal definition for "Deep Systems" and their common properties: they are layered, distributed, concurrent, multi-tenant, continuously changing, and a beast to manage with conventional tools! We then re-introduce the fundamentals of control theory from the 1960s, including the original conceptualizations of Observability and its conceptual cousin, Controllability. Finally, we use examples from Google and other organizations to illustrate how deep systems have damaged our ability to observe software, and what we need to do in order to regain confidence and control.

Ben Sigelman, CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard

2:55pm - 3:45pm

Evolutionary Architecture as Product @ CircleCI

Organizations continually evolve their technical architectures in order to adjust to the changing needs of their business. For example: systems must scale with increasing customer demand, tools must create efficiency in growing teams, and implementations must be generalized to support additional product features. At CircleCI, we face all of these drivers, but our role in the software delivery pipeline means we have the additional need to adapt to changes in how software is being built.

And the rate of change in software development approaches is like no other.

CircleCI's history has involved constantly adapting our product architecture to match transformations in the world of software development. From the explosive adoption of Docker to the steady rise of microservice architectures, the changing demands of software engineering teams have proven to be deeply coupled with the structure of CircleCI's service, far more than we anticipated when we started the business 8 years ago.

This talk will cover:

  • How the evolution of software development since 2011 has driven the evolution of CircleCI's architecture
  • Managing the cost of change when customers have the ability to customize almost anything
  • Predictions of future trends in software delivery and the architectural approaches we will take to support them

Robert Zuber, CTO @CircleCI

4:10pm - 5:00pm

Snowflake Architecture: Building a Data Warehouse for the Cloud

At Snowflake, we wanted to architect a data warehouse from the ground up to leverage all the benefits of the cloud. Unlike shared-nothing architectures that tie storage and compute together, we built a single integrated system with fully independent scaling for compute, storage and services. In the storage layer, we split data into micro-partitions and extract metadata for efficient query processing. At the compute layer, multiple virtual warehouses in separate compute clusters can simultaneously operate on the same data, giving high availability, performance isolation, scalability and concurrency. Virtual warehouses can also be automatically scaled up and down based on workload and performance.

This talk will cover the three pillars of the Snowflake architecture: 

  • Separating compute and storage to leverage abundant cloud compute resources
  • Building an ACID-compliant database system on immutable storage
  • Delivering a scalable multi-tenant data warehouse system as a service

Thierry Cruanes, Co-founder Snowflake Computing @SnowflakeDB

5:25pm - 6:15pm

Architectures Panel

How do big operators differ from smaller disruptors? This panel will examine the different architectures that power these systems.

Justin Ryan, Playback Edge Engineering @Netflix
Anvita Pandit, Software Developer @Google
Ben Sigelman, CEO and co-founder @LightStepHQ, Co-creator @OpenTracing API standard
Robert Zuber, CTO @CircleCI
Thierry Cruanes, Co-founder Snowflake Computing @SnowflakeDB

Date: Monday, 11 November