An Incident Story: Tips for How Staff+ Engineers Can Impact Incidents

In this talk, Erin Doyle will walk through her experience with a critical 3-day incident. She will discuss how, as a Staff Engineer, she missed a key opportunity to help prevent the incident and how Staff+ Engineers can influence a culture that prevents similar situations. She’ll then share lessons learned about how Staff+ Engineers can help improve both incident management and the post-mortem process. Lastly, she’ll touch on the automated solution whose implementation she led to prevent these kinds of problems going forward.

This story-telling will show you:

  • Ways that Staff+ Engineers can help foster a culture that could have prevented such an incident.
  • Lessons learned on how technical leaders can more effectively manage incidents.
  • Tips for how to encourage more useful incident post-mortems.
  • Examples of how to take a pragmatic approach to implementing improvements after an incident.

What is the focus of your work these days?

I was a full-stack developer for 20+ years, and then at the beginning of 2023 I moved to a Platform team. So my work now is in the Platform/DevOps/SRE space. I have especially enjoyed working on projects that improve developers' workflow and experience, on CI/CD and automation, and on collaborating with developers and setting up the infrastructure they need for their solutions.

What's the motivation for your talk?

My experience with this 3-day incident really opened my eyes to how much opportunity there is for Staff+ Engineers to positively affect the outcomes of incidents (or even prevent them altogether). You can have tools and automation to make the process a little smoother, but they don't help much if you don't have the right processes and culture in place around the tools. When tensions and stakes are high, as humans, we can feel pressure, rush, cut corners, and miss important things; we don't think clearly and aren't always at our best. Having level-headed, experienced leaders to drive and direct people through the processes and safeguards you put in place can make the difference. And mindful leaders can plant the seeds and provide the care and feeding needed to foster a culture that cuts down on many of the factors that contribute to incidents.

How would you describe the persona and level of the target audience?

The target audience for this talk includes all current or aspiring technical leaders. While the track is Staff+, I believe this talk can apply to Senior Engineers and up, as well as Engineering Managers. Whether you already have an incident management process in place, have no process at all, or are somewhere in between, I hope you will find useful tips and perspectives in this talk.

What do you want this persona to walk away with from your presentation?

I hope attendees will walk away from this talk, at the very least, reflecting on the current culture and incident management process at their company. I hope this talk will give them ideas for how they can influence and improve their company's culture, find places where their process can be improved, or drive change around incidents.

What do you think is the next big disruption in software?

I really want to say something other than "AI", but I do think we're on the cusp of something really big with AI and Machine Learning. There are minds brainstorming all across the world on what we can do with this technology, and I think we're going to see applications beyond our imaginations soon. What I'm really interested in, though, is the application to the developer workflow. We already have a bunch of tools and techniques popping up that use AI to enhance developers' tasks. It will be exciting to see what new tools and applications come out, what becomes part of the typical developer's quiver, and how these tools and techniques elevate developer productivity to allow for new levels of achievement and discovery.


Speaker

Erin Doyle

Staff Platform Engineer @Lob, with 20+ Years Previously as a Full Stack Engineer and Instructor @Egghead

Erin Doyle is a Staff Platform Engineer @Lob. For the 20+ years before that, she worked as a Full Stack Engineer with a focus on the front end. She’s also an instructor for Egghead.io and has given talks and workshops focusing on building the best and most accessible experiences for users and developers.

Date

Monday Oct 2 / 10:35AM PDT ( 50 minutes )

Location

Seacliff ABC

Topics

Staff Plus Engineering, Incident Management, Platform Engineering

From the same track

Session Staff Plus Engineering

Things Every Staff+ Engineer Should Know

Monday Oct 2 / 11:45AM PDT

As staff+ engineers, we're often thrown into the deep end and expected to navigate huge amounts of ambiguity including ambiguity about what our jobs even are. It's common to feel like there's a huge amount of trial and error or even luck.

Joy Ebertz

Principal Engineer @Split, Blogger, and Speaker, Previously @Box

Session Staff Plus Engineering

Managing Staff+ Engineers: Opportunities and Challenges

Monday Oct 2 / 01:35PM PDT

Staff+ engineers can be a powerful force in your organization…if you let them. Effectively managing Staff+ engineers requires different strategies than many managers are used to employing.

Adam Schirmacher

Staff Engineer & Manager of Staff Engineers @Gusto

Session Staff Plus Engineering

Risk and Failure on the Path to Staff Engineer

Monday Oct 2 / 03:55PM PDT

Even as inaction can be a risk, deciding which actions to take in one's career involves choosing between different bets. I've developed a rubric for judging specific risks that I've taken.

Caleb Hyde

Site Reliability Engineer @Expel

Session

Unconference: Staff+ Engineering

Monday Oct 2 / 05:05PM PDT

What is an unconference? An unconference is a participant-driven meeting. Attendees come together, bringing their challenges and relying on the experience and know-how of their peers for solutions.

Session

Panel: Staff+ Engineering Skills

Monday Oct 2 / 02:45PM PDT

Staff+ engineering is a critical role in any high-performing engineering organization. But what does it take to get promoted or get hired into a staff role? What does it take to keep it? Join us for a panel discussion with experienced Staff+ engineers who will share their insights.

Erin Doyle

Staff Platform Engineer @Lob, with 20+ Years Previously as a Full Stack Engineer and Instructor @Egghead

Adam Schirmacher

Staff Engineer & Manager of Staff Engineers @Gusto

Joy Ebertz

Principal Engineer @Split, Blogger, and Speaker, Previously @Box

Caleb Hyde

Site Reliability Engineer @Expel