I've put together a collection of approaches
and models from some of the most influential
thinkers in the fields of risk and complexity.
My goal is to expand this over the coming weeks,
creating a reference database I can consult and learn from.
So far, I've included insights from:
• Nassim Taleb
• Yaneer Bar-Yam
• Stephen Wolfram
and others.
Together, they cover a wide range of topics,
from complexity, fragility, and risk to ergodicity.
Check it out:
resilienceengineers.github.io/Expert-Repos...
Let me know what you think!
You copy and paste a prompt.
Get a saucy scenario.
Next, you enter the voice mode
for an interactive crisis communication
interview with an aggressive journalist.
In the language of your choice.
The saucy detail, the memo about the ignored safety issue?
Unfortunately leaked to the journalist.
Maybe she won't mention it?
She will, and you have to react live in the interview.
After that, you'll get a transcript of the interview
and have it analysed to learn what went well, and what didn't.
This will boost your crisis management skills by 100%,
and it is available 24/7, when you need it.
Sounds like expensive crisis management training?
No.
This can be done with your free ChatGPT account.
If you get the paid ($20) version, you can do even deeper analysis.
I've put together the exact how-to descriptions,
including videos & a prompt template.
resilience13.gumroad.com/l/AICrisisCo...
Spain lost 15 GW of electricity in 5 seconds.
No cyberattack. No operator error. Just a tightly coupled,
interactively complex system reaching its limits.
This sequence was a textbook “normal accident”
scenario: multiple failures occurring nearly
simultaneously and feeding back on each other.
The recent blackout in Spain and Portugal
is exactly the kind of case Perrow described.
Failure in the transmission grid → cascade → regional collapse within minutes.
A tightly coupled, complex system, as outlined above.
No circuit breakers = no buffer.
Many organizations operate internal systems
(IT, logistics, compliance, finance) with
similar structural features.
To evaluate your exposure, ask:
𝗜𝘀 𝘆𝗼𝘂𝗿 𝘀𝘆𝘀𝘁𝗲𝗺 𝘁𝗶𝗴𝗵𝘁𝗹𝘆 𝗰𝗼𝘂𝗽𝗹𝗲𝗱?
• Can steps be delayed or reordered without consequence?
• Are buffers (in time, resources, or control) present, or absent?
𝗜𝘀 𝘆𝗼𝘂𝗿 𝘀𝘆𝘀𝘁𝗲𝗺 𝗶𝗻𝘁𝗲𝗿𝗮𝗰𝘁𝗶𝘃𝗲𝗹𝘆 𝗰𝗼𝗺𝗽𝗹𝗲𝘅?
• Are there nonlinear, opaque interdependencies across teams or tools?
• Do multiple subsystems interact in unpredictable ways?
If the answer to both is yes, you are operating
in a domain where normal accidents are structurally plausible.
And when such systems fail, they often fail fast, and wide.
What can be done?
1. Map interdependencies. Focus not just on tasks,
but on how functions interact under stress.
2. Introduce slack. Time buffers, reversible decisions,
and fallback procedures reduce propagation risk.
3. Run structured failure simulations.
Don’t just audit success, rehearse coordinated failure.
4. Support a culture of systemic awareness. Encourage
people to flag design-level fragilities, not just execution errors.
I have done an in-depth analysis applying Perrow's
Normal Accident Theory, which will be shared
in my Newsletter on Saturday.
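The coupling-and-buffer point can be made concrete with a toy cascade model. This is a sketch under my own simplifying assumptions (a plain line of nodes, one shock absorbed per unit of slack), not Perrow's formalism and not a grid simulation:

```python
import random

def cascade_size(n_nodes, coupling, buffer_capacity, seed=0):
    """Toy cascade: node 0 fails; each failed node knocks out a
    neighbour with probability `coupling`, unless that neighbour
    still has spare buffer (slack) to absorb the shock."""
    rng = random.Random(seed)
    failed = {0}
    frontier = [0]
    buffers = {i: buffer_capacity for i in range(n_nodes)}
    while frontier:
        node = frontier.pop()
        for neighbour in (node - 1, node + 1):  # simple line topology
            if 0 <= neighbour < n_nodes and neighbour not in failed:
                if rng.random() < coupling:
                    if buffers[neighbour] > 0:
                        buffers[neighbour] -= 1  # slack absorbs the shock
                    else:
                        failed.add(neighbour)
                        frontier.append(neighbour)
    return len(failed)

# Tight coupling, no slack: the whole line goes down.
print(cascade_size(100, 1.0, 0))  # 100
# Same coupling, one unit of slack per node: the failure stays local.
print(cascade_size(100, 1.0, 1))  # 1
```

Even this crude sketch shows the asymmetry the post describes: the coupling parameter barely matters once every node has a little slack, and matters enormously once the buffers are gone.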
Yesterday, I saw a post about the SAIDI score.
𝗦𝗔𝗜𝗗𝗜 = 𝗦𝘆𝘀𝘁𝗲𝗺 𝗔𝘃𝗲𝗿𝗮𝗴𝗲 𝗜𝗻𝘁𝗲𝗿𝗿𝘂𝗽𝘁𝗶𝗼𝗻 𝗗𝘂𝗿𝗮𝘁𝗶𝗼𝗻 𝗜𝗻𝗱𝗲𝘅.
As long as you measure the wrong thing,
you'll always get a false sense of security.
Or give more meaning to a measurement than there is.
You basically divide the sum of all customer interruption
minutes in the past year by the number of
customers during that year.
This represents how long the average
customer experiences an outage.
Don't get me wrong, this is, in some parts,
a legitimate score.
It measures reliability in the power grid.
The lower the SAIDI minutes,
the better the electrical reliability.
Everyone familiar with my posts will immediately
see two issues screaming BE CAREFUL:
• Interruptions LAST YEAR
• AVERAGE outage time
But I saw experts concluding:
"The Austrian SAIDI score is 0.6 hours,
systems are highly resilient."
Well, the reason for this conclusion is that the one
event that defines the whole outcome didn't materialize this year.
The extreme event: the black swan.
Lesson of the story:
We don't know if we were simply lucky,
or if the system is truly resilient.
Also, ask London Heathrow Airport if they're happy
that the 𝘢𝘷𝘦𝘳𝘢𝘨𝘦 downtime is 0.3 hours,
when they had to shut down for 24 hours.
Don't blindly trust scores, risk assessments, or ratings.
And never listen to "experts" telling you that you
don't have to worry because the scores are low.
One blackout will turn the whole calculation on its head.
𝗣.𝗦.: In my course "𝘎𝘶𝘪𝘥𝘦 𝘵𝘰 𝘈𝘐 𝘪𝘯 𝘚𝘦𝘤𝘶𝘳𝘪𝘵𝘺 𝘙𝘪𝘴𝘬 𝘔𝘢𝘯𝘢𝘨𝘦𝘮𝘦𝘯𝘵", you will not only learn how to quantify risks with AI (no technical knowledge needed), but also where its limitations are.
DM me "SRM" if you want to join the waitlist.
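The SAIDI arithmetic, and how an average can mask one extreme event, can be sketched in a few lines of Python. All outage figures and the utility size here are invented for illustration:

```python
def saidi_hours(total_interruption_minutes, num_customers):
    """SAIDI: sum of all customer interruption minutes in a year,
    divided by the number of customers served, converted to hours."""
    return total_interruption_minutes / num_customers / 60

customers = 1_000_000  # hypothetical utility

# Scenario A: a steady drizzle of small outages, 36 min per customer.
steady = saidi_hours(36 * customers, customers)

# Scenario B: a quiet year, only 6 min per customer...
quiet = saidi_hours(6 * customers, customers)

# ...followed by a year with one 24-hour blackout hitting everyone.
blackout = saidi_hours(24 * 60 * customers, customers)

print(steady, quiet, blackout)  # 0.6 0.1 24.0
```

A grid that scored 0.6 hours last year tells you nothing about whether a 24-hour year is sitting in the tail; the average only reports what happened to materialize.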
I finally tossed my unused COVID stockpiles, just in time to start hoarding for the tariff crisis.
There are basically three options in response to tariffs:
- changing suppliers,
- pushing business partners to share the burden, or
- raising prices for Americans.