The Messy 9 and Coding with AI - A Panel Discussion
A panel discussion on how people actually use AI in their day-to-day software work, with special guests John Allspaw, Sheeri Cabral, Martin Smith and David Woods.
Going Solid
We talk about the seminal safety science paper and how it relates to our software world.
The Year in Resilience w/special guest John Allspaw
We chatted with John about the year in Resilience, incidents, and all the things.
Incident Status: On Hold w/special guest Will Gallego
We talked about whether there should be an “on hold” status for incidents, which also bled into talking about incident severity, and other lies we tell ourselves (vegetables?)
Complex Systems and the Messy Nine w/special guests Dave Woods and John Allspaw
We get to premiere a new RE concept from Dave Woods. Come join us for a discussion on Resilience Engineering and the Messy 9.
All the things about Incident Command
We got a question about how to advocate for having an incident command/comms role.
Root Cause Analysis vs. Resilience Engineering w/special guest Lorin Hochstein
Join us as we go deep into the differences between root cause analysis and other perspectives on how to do post-incident learning and analysis.
First Stories/Second Stories
We’re talking about first stories and second stories, and their effects on people involved in incidents.
How (Not) to Introduce Resilience Engineering at Work with special guest Michelle Casey
We talked with Michelle Casey about her most recent blog post for the Resilience in Software Foundation, and what it takes to get people at work to latch onto some good ideas from the Resilience Engineering world.
How long should you wait after an incident to do your retro?
Someone wanted to know if we think software should be more like the FAA… we got a little sidetracked by action items, but there’s some advice in here, we think?
Lund University - Academic Theory and Practice
We talked with a distinguished panel of Lund University MSc HFSS alumni and current students to learn more about the program and bridging theory and practice.
What’s the ROI on Reliability and Resilience work?
The dreaded question that we all get... and we have some answers for you, sort of?
Runbooks: the Good, Bad and Ugly w/special guest Andrew Hatch
We chatted with Andrew Hatch about runbooks and when they’re terrible, or when they might not suck so bad.
What is an incident? How come no one declares them?
We talked about the politics and trouble with declaring incidents, and how to improve how your organization handles them.
Chaos Engineering w/special guest Casey Rosenthal
We chatted with Casey Rosenthal about what chaos engineering is and how it’s different from (or the same as) resilience engineering.
Burnout on Aisle 3
Colette and Clint talk burnout and why resilience engineering sees so much of it.
Resilience, Complexity, and Your Boss a collab w/Punk Rock Safety
We met with the guys at Punk Rock Safety to talk through how to do resilience engineering even if your boss doesn’t get it (yet).
Live From SRECon
We grabbed a room at SRECon and some folks talked to us about resilience engineering!
Teaser Episode - Season 2
We talk about the last season of episodes, and what we’re excited for coming next!
Episode 10 - When They Go Full ITIL on You w/special guest John Allspaw
You can find John at Adaptive Capacity Labs or his (old) blog at Kitchen Soap.
ITIL is… well, it’s a thing.
Colette’s “You’re surprised it works in the first place” comes from Richard Cook’s brilliant Velocity talk in 2013.
FYI, John wasn’t talking about Franz Kafka; we think he was talking about Apache Kafka. But they are pretty similar, we think.