Presentation: Observability in the SSC: Seeing Into Your Build System

Track: Software Supply Chain

Location: Seacliff ABC

Duration: 2:55pm - 3:45pm

This presentation is now available to view on InfoQ.com

What You’ll Learn

  1. See how tools from production can be used to get insight into the build process. 
  2. Learn how to go from understanding your process to faster builds.

Abstract

Waiting for a slow build can really kick you out of the groove. Finding flaky tests using data instead of instinct increases trust. You and your team have a collection of sophisticated tools available to understand the complex applications you have running in production. Using these same tools to gain insight into your CI/CD pipeline enables your team to improve processes with the same rigor as performance analysis in production.  

At Honeycomb, our builds slowly got longer and longer until, without anyone noticing how it happened, everybody was frustrated. We used the tools we had available to explore instrumentation in the CI environment and visualized the data we found as traces and queries over time. With that insight we dropped build times by 40% and gave ourselves the ability to track build times and asset sizes over time. This talk walks through that transformation and covers the techniques you can use to accomplish the same goals in your environments.

Question: 

What is the work you're doing today?

Answer: 

I'm working at Honeycomb. We are building observability tools -- a new way for people to better understand performance, errors, and application behavior in production. A lot of APM is focused on performance but we really seek to understand _everything_ about how the application is behaving, both in expected and unexpected ways.

The focus is on deep instrumentation and wide events to really cover what's going on with your application when it's running. My work there is focused on the ingestion part and the back end; I'm not as focused on the presentation layer and the Web UI. There's a lot of data to move around, an enormous amount of analysis to do, and it's a very exciting problem.

Question: 

What are your goals for the talk?

Answer: 

As a young startup, we hit a number of issues along the way that are totally common. We have our automated builds and, as with any growing codebase, the builds slow down. We spend a certain amount of time making them faster, and it's a recurring thing.

Honeycomb is a company and product that is built around trying to understand the workings of complex production processes. We chose to apply our own tool to this build process and see what happened. The visibility into this normally opaque black box has been stunning. This is an interesting idea: taking the tools that we normally apply to production infrastructure, whether it's tracing or metrics or anything else, and using those to influence more of the software supply chain, more of the build and test processes. Every step on the path from commit to deploy can benefit from using the same toolset that we (as operators) are already experienced with for running complicated applications.

By doing so, we can really improve the lives of our own developers. We can improve the efficiency of our development process. This is an exciting area because, for the most part, there hasn't been a whole lot of work focused there. Some of the tools you might use in the software supply chain export some metrics: GitHub has its commit history, and build systems have a small number of timings and status fields for their builds. We have other really fancy tools, and as developers and operators, we know how to understand very complex systems using these tools. We should take advantage of that.

My talk is one story of using one of these tools to understand our build system. I’ll discuss the benefits we got from it and the realization along the way that it actually wasn't that different and it wasn't that complicated. I want to call out to the rest of the world to take these tools and see how they can apply to the systems that have normally been hidden behind curtains and very opaque. 

Question: 

I really love that insight because it's one of those things that seems really obvious in retrospect. But it never occurred to me.

Answer: 

It didn't occur to us either. We were just like, “Hey, we could do this thing” and it was fun. And then a couple of times at a conference or something, we would show somebody our builds and they’d go “What?! How did you get that?”

It took some external feedback to help us make the leap to realizing that this is a thing that could be more common, and should be more common. It could really benefit all of our development processes when it becomes a little more mature.

Question: 

What do you want people to leave the talk with?

Answer: 

A couple of things. First, the understanding that build systems are not as impenetrable as they appear to be. The purpose of a build system is to run a bunch of commands, and by hooking into that and using some normal operating system and process tricks, you can get an enormous amount of insight into your build processes very easily. Second, that you can use this insight to focus your developer actions around maintaining your build system. And third, that there are huge areas of software supply chain outside of build and test that will benefit from this same kind of extra analysis.

I don't know how to do that last part! I want everyone in the audience to think about the tools that they're familiar with and that they have built or have experience using and see how they could apply them to different parts of the development lifecycle and make the next step towards observability in the software supply chain. 
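
As an illustration of the first point above (a build system just runs commands, so you can hook into them), here is a minimal sketch in Python. It is not Honeycomb's tooling and not something shown in the talk: the `run_step` helper, the event field names, and the step names are all hypothetical. It wraps each build command in a subprocess call, times it, and records one span-like event per step that could later be sent to whatever tracing or metrics backend you already use.

```python
import json
import subprocess
import sys
import time
import uuid


def run_step(name, argv, trace_id, sink):
    """Run one build command and append a span-like event describing it.

    Hypothetical helper: the field names below are illustrative only and
    do not follow any particular vendor's schema.
    """
    started_at = time.time()          # wall-clock start, for ordering steps
    t0 = time.perf_counter()          # monotonic clock, for the duration
    proc = subprocess.run(argv)       # the actual build command
    duration_ms = (time.perf_counter() - t0) * 1000.0

    event = {
        "trace_id": trace_id,         # ties all steps of one build together
        "name": name,
        "command": " ".join(argv),
        "start_time": started_at,
        "duration_ms": round(duration_ms, 3),
        "exit_code": proc.returncode,
    }
    sink.append(event)
    return event


if __name__ == "__main__":
    # One trace id per build; each wrapped command becomes one event.
    trace_id = uuid.uuid4().hex
    events = []

    # Stand-in commands; a real build would call its compiler, test runner, etc.
    run_step("compile", [sys.executable, "-c", "print('compiling')"], trace_id, events)
    run_step("test", [sys.executable, "-c", "import sys; sys.exit(1)"], trace_id, events)

    # Print one JSON object per step; a real setup would ship these elsewhere.
    for event in events:
        print(json.dumps(event))
```

Because every step carries the same `trace_id`, the events from one build can be grouped into a single trace, and the `duration_ms` and `exit_code` fields are enough to start asking which steps are slow or fail most often over time.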

Question: 

What do you think is the next big disruption in software?

Answer: 

In terms of a real shift in the way humanity interacts with technology, my bet is on augmented reality as a path towards making the sci-fi novel vision of VR a reality. There are a couple of companies that tried to go straight from where we are to virtual reality, and they have loyal followings but not wide adoption. There's a path to get there through AR, and that's one of the most interesting bits for me long term.

Speaker: Ben Hartshorne

Engineer @honeycombio

Ben Hartshorne is an engineer at Honeycomb. For the last 13 years, Ben has built monitoring, alerting, and observability systems for companies ranging from startups like Simply Hired and Parse to large organizations such as Wikimedia and Facebook. Strangely, he actually enjoys this work and is happy to finally be building a company and product that will help tease out nuances in data in novel and powerful ways.
