
Randall Bennett

@randallb.com

builds

53 Followers · 5 Following · 306 Posts · Joined 14.04.2023

Latest posts by Randall Bennett @randallb.com

The funniest part of this whole "CEOs who code" thing is that we're about 6-12 months from CEOs who just act like CEOs, and coders who act like CEOs.

Code is downstream of intent. Anything a CEO does that isn't communicating intent is a waste of time.

09.03.2026 21:41 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

The AI safety people might be full of crap a lot of times, but I'm really grateful they exist because I think AI is in a much better position than it would have been without them.

Anthropic and OpenAI both existing (I think) is net good for the world.

09.03.2026 14:49 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

More than ever: Since you can create anything, create your own thing first so you can understand what's good or bad.

Unless you have an urgent need and don't need to understand. Then use the easy thing.

But knowing how something works means you'll always have better taste.

06.03.2026 18:16 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

If true, this means that model training should actually avoid internal knowledge and focus on problem solving.

IDK how you'd do that, but it seems like the right place to end up.

27.02.2026 14:27 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

We should think of LLMs the same way we think of neuroscience:

Context window is short term / immediate consciousness.
Model training is instincts.
Long-term memory is tool calls.

27.02.2026 14:27 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

I will build a first version of any LLM app for free. Reach out: help@boltfoundry.com

17.02.2026 14:46 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

My new hobby: Deleting most of the text of prompts and getting the output I was going for in the first place.

16.02.2026 22:42 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Clearly communicating what you want, and don't want, is how to drive models into doing what you're looking for without having to watch over their shoulder.

16.02.2026 05:48 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Full post:
open.substack.com/pub/randall...

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Preview: AAR template.md (GitHub Gist)

Templates:
gist.github.com/randallb/ac...

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Context
Intent
What actually happened (facts only)
Delta analysis (why it was different)
Initiative assessment (when the AI made its own decisions)
Weaknesses in intent (parts where the intent wasn't clear enough)
What we will sustain
What we will improve

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

"Improve through structured feedback" means keeping a "policy" folder and running an "After Action Review", which is structured like this:

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Try to find any runbooks or policies that apply to your work, and make sure you follow them. You can do this! Good luck!

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

let's implement this. Use your best judgement. Make sure to run through the verification steps thoroughly. I'm not going to be around, so prioritize using your best judgement and making frequent commits, but don't push them; keep them local.

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Decentralized execution means you give your AI the intent and then copy / paste this text:

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Centralizing intent means defining these categories in a file:

Purpose
End State
Constraints
Tradeoffs
Risk tolerance
Escalation conditions
Verification Steps
Activation / Revalidation
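As a sketch, such an intent file might look like this. The filename, feature, and wording are hypothetical examples, not from the post:

```markdown
# Intent: ship-dark-mode (hypothetical example)

## Purpose
Let users read the app comfortably at night.

## End State
A persisted theme toggle in settings; every screen renders in both themes.

## Constraints
No new dependencies; don't touch the auth screens.

## Tradeoffs
Prefer shipping a good-enough palette over pixel-perfect design.

## Risk tolerance
Low: visual regressions are acceptable, data loss is not.

## Escalation conditions
Stop and ask if any change requires a schema migration.

## Verification Steps
Run the snapshot tests; manually toggle themes on the three main screens.

## Activation / Revalidation
Revisit this file if the design system is replaced.
```

The point is that the file is the single source of intent: the agent reads it at the start of every run instead of relying on whatever survives in chat history.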

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

You can one-shot any feature (or at max 3-shot) in agentic coding if you use this formula I call product command.

1/ Centralize intent
2/ Distribute execution
3/ Improve through structured feedback

13.02.2026 17:52 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Here's my product command cheat prompt:

Purpose:
End State:
Constraints:
Tradeoffs:
Allowed Changes:
Risk Tolerance:
When to escalate:
Testing + Verification plan:

Ask your agent to fill this out when they're about to do something. If you disagree, you're not aligned.

11.02.2026 09:02 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

"Intent matters. Intent is king. You cannot do what I attempt by accident. You must mean it. This seems a much greater law than we've ever before understood." --Brandon Sanderson, Rhythm of War (endnotes)

09.02.2026 16:58 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

I don't love production engineering, but I've worked with some really great production engineers (at FB mostly) and I think I'd rather hire my LLM / coding agent to do a 50-80% job of that work so I can focus on creativity and product dev.

20.01.2026 07:02 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Obviously things like Nix, or other repeatability-first tools, are in a similar boat. But with things like DNS credentials or provisioning boxes, having an LLM able to do that means that when you hit odd corners of your stack, it might be able to get you some mileage.

20.01.2026 07:02 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Our team is only 3 people, but I recently started using Terraform as part of our infra. I don't use it for a lot of things, mostly DNS, but infrastructure as code (IaC) takes tribal prod-ops knowledge and codifies it in a way where it's really accessible to an LLM.
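As an illustration of what becomes LLM-legible this way, a DNS record in Terraform might look like the following. The provider, zone variable, and address are hypothetical placeholders, not the author's actual config:

```hcl
# Hypothetical sketch: one DNS A record managed as code.
# An LLM can read, explain, and safely edit this in a way it
# never could with point-and-click registrar settings.
resource "cloudflare_record" "app" {
  zone_id = var.zone_id        # placeholder: your DNS zone
  name    = "app"              # app.example.com
  type    = "A"
  value   = "203.0.113.10"     # documentation-range IP, not real
  ttl     = 300
}
```

Because the record is plain text in the repo, `terraform plan` also gives the agent (and you) a diff to review before anything changes in production.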

20.01.2026 07:02 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
Gambit demo video - Launch edition
If you’re not familiar, agent harnesses are sort of like an operating system for an agent… they handle tool calling, planning, context window management, and...

Walkthrough video: youtu.be/J_hQ2L_yy60

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Preview: bolt-foundry/gambit (GitHub): Agent harness framework for building, running, and verifying LLM workflows

We’ll be around if y’all have any questions or thoughts. Thanks for checking us out!

github.com/bolt-foundr...

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

- Rubric-based grading to guarantee you (for instance) don’t leak PII accidentally
- Spin up a usable bot in minutes and have Codex or Claude Code use our command-line runner / graders to build a first version that is pretty good w/ very little human intervention.

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We’re really happy with how it’s working with some of our early design partners, and we think it’s a way to implement a lot of interesting applications:

- Truly open source agents and assistants, where logic, code, and prompts can be easily shared with the community.

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We know it’s missing some obvious parts, but we wanted to get this out there to see how it could help people or start conversations.

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Prior to Gambit, we had built an LLM based video editor, and we weren’t happy with the results, which is what brought us down this path of improving inference time LLM quality.

16.01.2026 00:26 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We also have test agents you can define on a deck-by-deck basis, that are designed to mimic scenarios your agent would face and generate synthetic data for either humans or graders to grade.

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Additionally, each step of the chain gets automatic evals, which we call graders. A grader is another deck type… but it’s designed to evaluate and score conversations (or individual conversation turns).

16.01.2026 00:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0