Conference: Nov 5–7, 2018
Workshops: Nov 8–9, 2018
Presentation: Avoiding Alerts Overload From Microservices
What You’ll Learn
- Focus on the stuff that matters when it comes to developing Microservices
- Understand how to build a system with support in mind
- Understand practical approaches to optimizing alerts and the infrastructure to support them
Abstract
Microservices can be a great way to work: the services are simple, you can use the right technology for the job, and deployments become smaller and less risky. Unfortunately, other things become more complex. You probably took some time to design a deployment pipeline and set up self-service provisioning, for example. But did the rest of your thinking about what “done” means catch up? Are you still setting up alerts, run books, and monitoring for each microservice as though it were a monolith?
Two years ago, a team at the FT started out building a microservices-based system from scratch. Their initial naive approach to monitoring meant that an underlying network issue could mean 20 people each receiving 10,000 alert emails overnight. With that volume, you can’t pick out the important stuff. In fact, your inbox is unusable unless you have everything filtered away where you’ll never see it. Furthermore, you have information radiators all over the place, but there’s always something flashing or the wrong color. You can spend the whole day moving from one attention-grabbing screen to another.
That team now has over 150 microservices in production. So how did they get themselves out of that mess and regain control of their inboxes and their time? First, you have to work out what’s important, and then you have to ruthlessly narrow your focus to that. You need to be able to see only the things you need to take action on, in a way that tells you exactly what you need to do. Sarah shares how her team regained control and offers some tips and tricks.
Interview
I’m the tech lead for the Content platform at the Financial Times. The platform handles publication of content from multiple content management systems, annotating that content with metadata via concept extraction and editorial curation, and making all of that information available via a set of APIs, so any product within the FT or outside that delivers our content has a stable base to build on.
As part of this work, we are completely revamping our metadata, replacing taxonomy-based metadata - essentially, lists of terms in a variety of categories such as authors, people, companies - with ontology-based metadata, i.e. metadata based on real things, with the ability to navigate the relationships between those things. As an example, in our new metadata, an author is a person who is a writer. This means we don’t have the same name appearing as a term within people, authors and brands (which is highly confusing to deal with), and it gives us a lot more flexibility to show content based on that metadata.
We have five development teams working on our system, which is made up of nearly 300 microservices. We were early adopters of Docker, building a lot of the cluster management tools ourselves. My focus this year is making sure that all the work we do on our platform works towards our functional and architectural goals - no local optimizations - and that our production stack is as stable and easy to use as possible so we spend time on the things that matter to our business: for example, we are migrating to Kubernetes and replacing some of our hand-written tools.
This was the first microservices architecture I’ve worked on, and one of the early things I learnt is that you just can’t operate microservices the way you did a monolith.
Microservices make the code easier to reason about and deploy frequently, but you have to do DevOps to make it work, and you have to keep a keen focus on building things for operability.
In the early days, my inbox was full of alerts every day, and it was very hard to work out what was something I needed to take action on. We’ve invested a lot of time in finding ways to solve that problem and I wanted to share that experience with others.
Tech Lead/Architect/Developer/Senior Management: anyone operating a microservices architecture or planning to.
It’s a technical talk. It assumes you know about microservices and that you have been responsible for supporting a system.
There are lots of concrete suggestions of things to install or develop that will help with operating a microservices-based system, but I hope the main actionable will be that you have to care about this stuff and work on it constantly.
Serverless feels like it could completely change things. At the moment I’m seeing us use Lambda mostly for more ‘housekeeping’ tasks rather than for production-critical systems, but that’s going to change and I can see the attraction of not having a server to maintain.
I wonder how easy it will be to support a system made up of hundreds of functions running on hardware you can’t see. It’ll be a whole new set of observability challenges!
Tracks
-
Architectures You've Always Wondered About
Architectural practices from the world's most well-known properties, featuring startups, massive scale, evolving architectures, and software tools used by nearly all of us.
-
Going Serverless
Learn about the state of Serverless & how to successfully leverage it! Lessons learned in the track hit on security, scalability, IoT, and offer warnings to watch out for.
-
Microservices: Patterns and Practices
Stories of success and failure building modern Microservices, including event sourcing, reactive, decomposition, & more.
-
DevOps: You Build It, You Run It
Pushing DevOps beyond adoption into cultural change. Hear about designing resilience, managing alerting, CI/CD lessons, & security. Features lessons from open source, LinkedIn, Netflix, Financial Times, & more.
-
The Art of Chaos Engineering
Failure is going to happen - Are you ready? Chaos engineering is an emerging discipline - What is the state of the art?
-
The Whole Engineer
Success as an engineer is more than writing code. Hear inward looking thoughts on inclusion, attitude, leadership, remote working, and not becoming the brilliant jerk.
-
Evolving Java
Java continues to evolve & change. Track covers Spring 5, async, Kotlin, serverless, the 6-month cadence plans, & AI/ML use cases.
-
Security: Attacking and Defending
Offensive and defensive security evolution that application developers should know about, including SGX Enclaves, effects of AI, software exploitation techniques, & crowd defense.
-
The Practice & Frontiers of AI
Learn about machine learning in practice and on the horizon. Learn about ML at Quora, Uber's Michelangelo, ML workflow with Netflix Meson and topics on Bots, Conversational interfaces, automation, and deployment practices in the space.
-
21st Century Languages
Compile to Native, Microservices, Machine learning... tailor-made languages solving modern challenges, featuring use cases around Go, Rust, C#, and Elm.
-
Modern CS in the Real World
Applied trends in Computer Science that are likely to affect Software Engineers today. Topics include category theory, crypto, CRDTs, logic-based automated reasoning, and more.
-
Stream Processing In The Modern Age
Compelling applications of stream processing using Flink, Beam, Spark, Strymon & recent advances in the field, including Custom Windowing, Stateful Streaming, SQL over Streams.
-
Performance Mythbusting
Real world, applied performance proofs across stacks. Hear performance considerations for .NET, Python, & Java. Learn performance use cases with OpenJ9, Instagram, and Netflix.
-
Tools and Culture: What's Beyond a Stack of Containers?
Containers are not just a technology. They're a platform. Push your knowledge.
-
Web as Platform
All things Browser, from JavaScript Frameworks for animation and AR / VR to WebAssembly and from protocol work to open standards evolution.
-
Beyond Being an Individual Contributor
Beyond being an individual contributor. Building and evolving managers and tech leadership.
-
Building Great Engineering Cultures
Why engineering culture matters. Track features org scaling, memes as a culture tool, Ally skills, and panels on diversity / inclusion.
-
Hardware Frontiers: Changes Affecting Software Developers Today
Topics around: Quantum computing, NVM, SMR, GPU, custom hardware, self-driving cars, and mobile hardware.