Everyone wants to use agents for everything, but a set of if/else/for/while statements still wins 9/10 times.
I'm adding this label to the software I sell:
"Every line of code in this product was written, reviewed, and approved by a human who understands, guarantees, and stands behind it."
Google: "Our productivity has increased 10% thanks to AI."
Random Internet User: "Skill Issue! We are 100x faster!"
This is a tough one: I'm not sure who to believe here.
I told a client the code behind the project I was selling them was 100% AI-free.
He loved it.
I'll do it again.
I've been uploading screenshots of my investment allocation to ChatGPT and asking where to invest a specific amount.
I've been doing this for about a year.
10/10 recommend.
If I'm building new functionality, I ask Claude to propose new tests.
I've seen so many people simply asking Claude to "verify the code" and hoping that's enough.
Untested code is broken code.
Write. Tests.
2/2
Writing tests is a superpower, especially now with agentic coding.
Every time I ask Claude Code to build something, I get it to run my test suite:
1. It writes the code
2. Runs my test suite
3. Identifies regressions and fixes them
4. Repeats
1/2
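The loop above assumes a test suite the agent can run against. A minimal sketch of what one such regression test might look like (the function name `apply_discount` and its behavior are hypothetical, just a stand-in for real application code):

```python
# A hypothetical piece of application code plus the regression test
# that an agent would be asked to keep green after every change.

def apply_discount(price: float, pct: float) -> float:
    """Apply a percentage discount, clamping pct to [0, 100]."""
    pct = max(0.0, min(100.0, pct))
    return round(price * (1 - pct / 100), 2)

def test_apply_discount():
    assert apply_discount(100.0, 10) == 90.0
    assert apply_discount(100.0, 150) == 0.0   # pct clamped to 100
    assert apply_discount(100.0, -5) == 100.0  # pct clamped to 0
```

The point is the edge cases: an agent "verifying the code" by eye won't catch a regression in the clamping logic, but this test will.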
Agents are functions. It's much better to think of them as unreliable APIs that will give you the correct answer most of the time.
As soon as I did that, my code became cleaner and much more efficient.
When you anthropomorphize agents, you can't see them for what they are.
This might sound weird, but errors didn't make sense because I was thinking about agents the wrong way.
You should stop treating agents like humans.
This has probably been one of the biggest unlocks for me over the past few months in terms of building agents.
I see people giving agents personalities, goals, and even titles.
But agents aren't people — they are functions with context windows.
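"Unreliable API" is a pattern you can write down. A minimal sketch, where `call_model` is a hypothetical stand-in for a real LLM call that sometimes returns garbage:

```python
import json
import random

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call.
    # Most of the time it returns valid JSON; sometimes it doesn't.
    if random.random() < 0.8:
        return json.dumps({"answer": 42})
    return "Sorry, I can't do that."

def agent(prompt: str, retries: int = 3) -> dict:
    """Treat the model as an unreliable function: validate the output,
    retry on failure, and fail loudly when retries are exhausted."""
    for _ in range(retries):
        raw = call_model(prompt)
        try:
            result = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed output: try again
        if "answer" in result:  # schema check
            return result
    raise RuntimeError("agent failed after retries")
```

No personality, no title: just a function with inputs, outputs, a validation step, and an error path.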
At this point, the data is ready for training.
They are processing more than 10M documents, and the pipeline takes several hours to complete.
A significant amount of code was written here, and 0% of it was AI-generated.
4/4
1. Load raw data from 6 different places
2. Remove unnecessary tokens
3. Remove low-quality data
4. Remove duplicated data
5. Replace personally identifiable information
6. Normalize data formats
7. Align multi-modal data
8. Enrich the data with domain-specific annotations
9. Tokenize the data
3/4
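A pipeline like the one above is usually just a sequence of functions applied in order. A toy sketch of two of the steps (deduplication and format normalization; the step implementations are illustrative, not the company's actual code):

```python
def remove_duplicates(records):
    # Step 4: drop exact duplicates while preserving order.
    seen = set()
    return [r for r in records if not (r in seen or seen.add(r))]

def normalize(records):
    # Step 6: lowercase and collapse whitespace.
    return [" ".join(r.lower().split()) for r in records]

# Steps run in order; each takes and returns a list of records.
PIPELINE = [remove_duplicates, normalize]

def run_pipeline(records):
    for step in PIPELINE:
        records = step(records)
    return records
```

At 10M+ documents each of these steps becomes its own engineering problem (exact dedup alone won't fit naively in memory), which is where the "considerable complexity" comes from.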
I wanted to outline the steps to give you an idea of what production applications typically deal with.
Here are the steps in their pipeline:
2/4
I'm working with a huge company on their data preparation: they run a 9-step data processing pipeline.
This all happens before their data is ready for training. There's a lot of code here and considerable complexity.
1/4
I'll try to stick around.
And no other programming language can match Python in code clarity and readability.
Except English, maybe.
On top of that, over 90% of the Python code you interact with uses C/Rust/Fast-thing behind the scenes anyway.
The speed at which you can write good Python code is way more important in most situations than the speed at which that code must run.
Python is the best language in the world.
Yes, Python is slower than other languages, but I don't care because most of my work doesn't require it to be faster.
Python is fast enough for what I need, and fast enough for most people out there.
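The "fast thing behind the scenes" point is easy to demonstrate without any third-party library: Python's built-in `sum` is implemented in C, and it beats an equivalent pure-Python loop handily:

```python
import timeit

data = list(range(1_000_000))

def manual_sum(xs):
    # Pure-Python loop: every iteration goes through the interpreter.
    total = 0
    for x in xs:
        total += x
    return total

# Built-in sum() runs its loop in C.
t_loop = timeit.timeit(lambda: manual_sum(data), number=5)
t_builtin = timeit.timeit(lambda: sum(data), number=5)
print(f"pure-Python loop: {t_loop:.3f}s, built-in sum: {t_builtin:.3f}s")
```

Same result, same readable call site, with the hot loop delegated to C.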
Gemini's documentation says that "responses for a given prompt are mostly deterministic, but a small amount of variation is still possible."
I'm not sure about you, but "mostly deterministic" is not the same as "deterministic."
9/9
So, in practice, the LLM you are calling is non-deterministic, even at temperature = 0.
OpenAI's documentation tells us that we can expect "(mostly) deterministic outputs across API calls."
8/9
• Your model provider will patch, retrain, and fine-tune the model you are using without telling you. What was true yesterday isn't guaranteed today. (I'm kinda cheating with this last point, because it's not related to the nature of models, but it's still a problem we have to deal with.)
7/9
• If you are using a model scaled across multiple servers, you are not always hitting the same model instance, which increases the chance of running into differences in hardware or kernel behavior.
6/9
• Floating-point operations are tricky, and even when using greedy decoding, a model might suffer from tiny numeric drifts that shift the outputs.
5/9
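The drift above comes from the fact that floating-point addition is not associative, so anything that changes the order of operations (parallel reductions, different kernels, different batch sizes) can change the result. You can see it in plain Python:

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c    # 0.0 + 1.0  -> 1.0
right = a + (b + c)   # c is absorbed by b's magnitude -> 1e16 + (-1e16) -> 0.0

print(left, right)    # 1.0 0.0
assert left != right
```

Inside a model, millions of these tiny discrepancies accumulate through the layers, occasionally flipping which token has the highest logit, even with greedy decoding.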
Theoretically, temperature=0 makes a model deterministic. In practice, this is not the case.
There are several reasons:
• Non-deterministic hardware ops: two runs of the same GPU kernel can diverge slightly, introducing small differences in your outputs.
4/9
Yesterday, most replies tried to school me about the temperature parameter when using an LLM. Yes, you can set this parameter to 0 to try to force a model to return the same answer, but in practice, you'll find this isn't guaranteed.
3/9
A deterministic model will always return the same answer given the same input. My argument is that the set of large language models we use today can't fulfill that contract.
2/9
Yesterday, I posted that large language models are non-deterministic, and people went bonkers.
People called me stupid, asked me to go back to school, and used so many other names I can't remember.
Today, I want to elaborate.
1/9
Building software that works is very easy… as long as you ignore all the hard parts.
If someone presents themselves as a "Programmer" and somebody else does it as a "Software Engineer," would you consider them equally capable?
Do you see a fundamental difference between a Programmer and a Software Engineer?
The final 20% of the work always takes 80% of the total effort.