Infrastructure
Presentations about Infrastructure
If You Don’t Know Where You’re Going, It Doesn’t Matter How Fast You Get There
Using Data Effectively: Beyond Art and Science
Yes, I Test In Production (And So Do You)
Human-Centric Machine Learning Infrastructure @Netflix
Full Cycle Developers @Netflix
Service Ownership @Slack
Day Two Kubernetes: Tools for Operability
Building Resilience in Production Migrations
DevOps For The Database
Evolving Continuous Integration: Applying CI to CI Strategy
Michelangelo - Machine Learning @Uber
DevOps & Lean Thinking Panel
Actionable Continuous Delivery Metrics
Reactive Cloud-Native Networking With RSocket
Continuous Reliability
Building Recommender Systems w/ Apache Spark 2.x
Interviews
Yes, I Test In Production (And So Do You)
What's the motivation for this talk?
The motivation for this talk is to help people understand that deploying software carries an irreducible element of uncertainty and risk. Trying too hard to prevent failures will actually make your systems and your teams *more* vulnerable to failure and prolonged downtime. So what can you do about it?
Martin Fowler famously wrote a blog post arguing that “you must be this tall for microservices,” meaning you need a certain level of organizational maturity before you’re really ready for them. Is there a “you must be this tall” equivalent for being able to test in production?
Absolutely, starting with observability. Until you can drill down and explain any anomalous raw event, you should not attempt any advanced maneuvers; you’ll just be irresponsibly screwing with production and starting fires right and left that you don’t even know about. And I mean observability in the technical sense: traditional monitoring is not good enough. You need instrumentation, an event-oriented perspective, ideally some tracing, and so on. You need tooling that lets you hunt down outliers and aggregate at read time by high-cardinality dimensions like request ID, with no aggregation at write time.
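To make "aggregate at read time" concrete, here is a minimal Python sketch. It is an illustration only, not a description of any particular observability tool: the event fields (`request_id`, `user_id`, `endpoint`, `duration_ms`), the sample values, and the helper names `group_at_read_time` and `find_outliers` are all assumptions made up for this example. The idea it tries to show is that raw events are kept as they arrived, so the grouping dimension is chosen when the question is asked and an outlier can be traced back to a single request.

```python
from collections import defaultdict
from statistics import median

# Hypothetical raw events: one wide record per request, stored unaggregated.
EVENTS = [
    {"request_id": "r1", "user_id": "u1", "endpoint": "/cart", "duration_ms": 40},
    {"request_id": "r2", "user_id": "u2", "endpoint": "/cart", "duration_ms": 45},
    {"request_id": "r3", "user_id": "u3", "endpoint": "/pay", "duration_ms": 50},
    {"request_id": "r4", "user_id": "u2", "endpoint": "/pay", "duration_ms": 2400},
    {"request_id": "r5", "user_id": "u1", "endpoint": "/cart", "duration_ms": 42},
]


def group_at_read_time(events, dimension, field):
    """Group a numeric field by any dimension, chosen at query time."""
    groups = defaultdict(list)
    for event in events:
        groups[event[dimension]].append(event[field])
    return dict(groups)


def find_outliers(events, field, factor=10):
    """Return the raw events whose field exceeds `factor` times the median."""
    threshold = factor * median(e[field] for e in events)
    return [e for e in events if e[field] > threshold]


# The same raw data answers questions nobody planned for in advance.
by_endpoint = group_at_read_time(EVENTS, "endpoint", "duration_ms")
by_user = group_at_read_time(EVENTS, "user_id", "duration_ms")

# Because nothing was pre-aggregated, the outlier is still a single request.
slow = find_outliers(EVENTS, "duration_ms")
print([e["request_id"] for e in slow])
```

Had the same data been rolled up at write time into, say, a per-endpoint average, the per-user breakdown and the individual slow request would no longer be recoverable.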