Finally some systems perspective
youtube.com/shorts/eQ54e...
Half man half straw man.
When there's an elephant stuck in a boa constrictor (an accumulated inventory of partially done work in the system), pushing it hard until it gets out of the boa constrictor is not the only way to unclog the system.
I'd say it's actually one of the worst ways to go about it.
So, "LLMs generated so much code for review" is not necessarily addressed with →
"Let's push that elephant of inventory hard downstream until we shove it down the customer's throat, even if that means removing essential parts of the process, like someone understanding the implementation we're deploying", but rather with "let's reduce the rate at which we generate that inventory".
Instead of pushing, we reduce the size of the elephant by limiting the upstream demand feeding it.
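The flow argument above can be sketched with a toy queue model (my own illustration, not from the post; the rates and capacities are hypothetical numbers): when code generation outpaces review capacity, the inventory of unreviewed work grows without bound, while limiting upstream demand below capacity keeps it near zero.

```python
# Illustrative sketch only: a deterministic flow model of unreviewed
# work. All numbers are invented to show the dynamics, nothing more.

def simulate(arrival_rate, review_capacity, ticks=100):
    """Queue of unreviewed work: each tick, `arrival_rate` items of
    work arrive and at most `review_capacity` items get reviewed."""
    queue = 0
    history = []
    for _ in range(ticks):
        queue += arrival_rate
        queue = max(0, queue - review_capacity)
        history.append(queue)
    return history

# Generation outpaces review: the elephant grows without bound.
growing = simulate(arrival_rate=5, review_capacity=4)
# Limit upstream demand below review capacity: no elephant forms.
limited = simulate(arrival_rate=3, review_capacity=4)
```

Note that "pushing harder" (raising `review_capacity` by skipping understanding) and "limiting demand" (lowering `arrival_rate`) both unclog the queue; the sketch just makes visible that the second option exists at all.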
No system is infinitely resilient.
These problems might be qualitatively, directionally the same, but a marginal increase in magnitude (quantity) doesn't mean the dynamics of the system remain unchanged.
All of them have thresholds (tipping points), and a sufficiently large change in the rate of flows in the system can easily trigger them. Unfortunately, those boundary conditions only become visible once they are crossed, which by definition means too late.
And once you cross them, it doesn't mean it takes the same amount of effort to bring the system back to its previous state. Often it takes exponentially more effort, and sometimes the crossing is not reversible at all (a runaway condition).
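That tipping-point dynamic can be made concrete with a toy model (entirely my own sketch; the threshold and coefficients are invented): below a threshold the system self-corrects, above it positive feedback amplifies the deviation, so the correction needed grows exponentially with how long you wait.

```python
# Toy model, illustrative only: state x self-corrects below the
# threshold, but runs away via positive feedback once past it.
THRESHOLD = 1.0

def step(x):
    if x < THRESHOLD:
        return 0.8 * x                    # damping: x decays back toward 0
    return x + 0.5 * (x - THRESHOLD)      # runaway: the excess compounds

def evolve(x, ticks):
    for _ in range(ticks):
        x = step(x)
    return x

# Just below the threshold: the system returns toward its old state.
below = evolve(0.99, ticks=20)
# Just past it: the same small initial difference compounds, and the
# effort to bring the system back grows the longer you wait.
above = evolve(1.01, ticks=20)
```

The two starting states differ by 2%, yet after twenty ticks one has decayed to roughly nothing while the other has run away, which is the sense in which the boundary only becomes visible after it is crossed.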
"We are anyway deploying to production code and libraries we don't own and whose inner workings we don't understand" is not an argument justifying shipping to production even the code we are creating without understanding it.
"Async PR reviews often end up with just LGTM-stamping anyway" is not an argument justifying doing even less of code reviews by going from at least one person per PR understanding the implementation to zero.
"On average, developers anyway create mediocre code" is not an argument justifying generating more of it at a rate at least an order of magnitude higher.
If we started referring to GenAI/LLM as "a big recommendation engine" instead (which I think it essentially is), I wonder how our perception of what it does, its capabilities and how we use it would change.
One thing that AI-hype has taught me is that there are far more techno-fixers out there than I thought.
"We can't afford understanding the code we're deploying to production because that way we can't keep up with the pace at which LLMs generate code."
That's the tail wagging the dog.
I think about this Tony Benn speech much more than I used to
No recording from my side unfortunately
Gilded Rose is fairly popular, and the solution Opus 4.5 generated is, I find, quite good, which makes sense given all the solutions out there it ingested during training. Same for the Mars Rover kata I tried.
What I found more interesting was asking one agent to generate a refactoring kata that doesn't exist, and then asking another one to refactor it.
I made it use all the skills I use: simple design, tell don't ask, baby steps, micro-commits, etc.
I then compared it to my solution and asked another agent to read my design criteria and rate the two.
The results were quite bad for the generated code, to say the least.
The same thing happened with a newly created TDD kata.
I can feel their vibe...
🤣
I heard you can just add a claude skill for it
…or understood.
(if I were to give them the benefit of the doubt)
No, no. Collective, human, just-in-time judgment as part of the process of building a shared mental model.
There’s no machine replacement for that.
+ the collective judgment and care about working code that we get through pair/mob programming is gone with the hyper-individualism we get with agentic coding.
It's at least weird seeing folks who were all for working together now pushing completely to the other end of the spectrum with LLMs.
"We need to remove humans from the review process because of the huge pile of code generated by LLMs, and thus you won't be able to reason about the system anymore, but when something in production breaks, and you want to know what's going on, you can ask this thing that can't tell the truth."
It's impossible to tell if an approach is faster or slower compared to another if you don't define where the finish line is.
It was sad seeing half of the internet going down because of an AWS or Cloudflare outage.
I'm guessing it's going to be even sadder seeing half of the orgs/teams not being able to reason about the systems their agentic AI built when LLM providers go down.
In which sense?
I feel that the everlasting tension between Product and Engineering, where one was pushing for more features and the other was trying to counter by advocating for a sustainable pace, got resolved with GenAI.
But in the wrong direction.
More features/code liability/complexity instead of thinner slicing.