Santiago

@svpino.com

I help companies build Machine Learning • I run http://ml.school. • Posts about what I learn along the way.

5,334 Followers · 158 Following · 397 Posts · Joined 15.03.2023

Latest posts by Santiago @svpino.com

Everyone wants to use agents for everything, but a set of if/else/for/while statements still wins 9/10 times.

25.10.2025 19:27 👍 10 🔁 0 💬 1 📌 0

I'm adding this label to the software I sell:

"Every line of code in this product was written, reviewed, and approved by a human who understands, guarantees, and stands behind it."

06.10.2025 12:15 👍 6 🔁 1 💬 0 📌 0

Google: "Our productivity has increased 10% thanks to AI."

Random Internet User: "Skill Issue! We are 100x faster!"

This is a tough one: I'm not sure who to believe here.

05.10.2025 12:10 👍 1 🔁 0 💬 3 📌 0

I told a client the code behind the project I was selling them was 100% AI-free.

He loved it.

I'll do it again.

04.10.2025 12:17 👍 4 🔁 0 💬 0 📌 0

I've been uploading screenshots of my investment allocation to ChatGPT and asking where to invest a specific amount.

I've been doing this for about a year.

10/10 recommend.

03.10.2025 19:21 👍 3 🔁 0 💬 1 📌 0

If I'm building new functionality, I ask Claude to propose new tests.

I've seen so many people simply asking Claude to "verify the code" and hoping that's enough.

Untested code is broken code.

Write. Tests.

2/2

02.10.2025 12:15 👍 1 🔁 0 💬 0 📌 0

Writing tests is a superpower, especially now with agentic coding.

Every time I ask Claude Code to build something, I get it to run my test suite:

1. It writes the code
2. Runs my test suite
3. Identifies regressions and fixes them
4. Repeats

1/2

02.10.2025 12:15 👍 3 🔁 0 💬 1 📌 0

Agents are functions. It's much better to think of them as unreliable APIs that will give you the correct answer most of the time.

As soon as I did that, my code became cleaner and much more efficient.

01.10.2025 15:02 👍 6 🔁 0 💬 0 📌 0

When you anthropomorphize agents, you can't see them for what they are.

This might sound weird, but errors didn't make sense because I was thinking about agents the wrong way.

01.10.2025 15:02 👍 1 🔁 0 💬 1 📌 0

You should stop treating agents like humans.

This has probably been one of the biggest unlocks for me over the past few months in terms of building agents.

I see people giving agents personalities, goals, and even titles.

But agents aren't people — they are functions with context windows.

01.10.2025 15:02 👍 5 🔁 1 💬 1 📌 0

At this point, the data is ready for training.

They are processing more than 10M documents, and the pipeline takes several hours to complete.

A significant amount of code was written here, and 0% of it was AI-generated.

4/4

01.10.2025 12:20 👍 2 🔁 0 💬 0 📌 0

1. Load raw data from 6 different places
2. Remove unnecessary tokens
3. Remove low-quality data
4. Remove duplicated data
5. Replace personally identifiable information
6. Normalize data formats
7. Align multi-modal data
8. Enrich the data with domain-specific annotations
9. Tokenize the data
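To give a feel for how steps like the ones above chain together, here is a toy sketch. It is not their pipeline: every function is a hypothetical stand-in covering only four of the nine steps, and the rules (a three-word quality threshold, a naive email pattern) are assumptions made up for the example.

```python
import re
from typing import Callable, Iterable

Step = Callable[[list[str]], list[str]]


def drop_low_quality(docs: list[str]) -> list[str]:
    # Stand-in quality filter: keep documents with at least three words.
    return [doc for doc in docs if len(doc.split()) >= 3]


def deduplicate(docs: list[str]) -> list[str]:
    # Remove exact duplicates while preserving the original order.
    return list(dict.fromkeys(docs))


def mask_emails(docs: list[str]) -> list[str]:
    # Stand-in for PII replacement: mask anything that looks like an email.
    return [re.sub(r"\S+@\S+", "<EMAIL>", doc) for doc in docs]


def normalize(docs: list[str]) -> list[str]:
    # Stand-in for format normalization: collapse runs of whitespace.
    return [" ".join(doc.split()) for doc in docs]


def run_pipeline(docs: list[str], steps: Iterable[Step]) -> list[str]:
    # Each step takes the full list of documents and returns a new list.
    for step in steps:
        docs = step(docs)
    return docs


raw = [
    "Contact me at ana@example.com  today",
    "Contact me at ana@example.com  today",
    "too short",
]
clean = run_pipeline(raw, [drop_low_quality, deduplicate, mask_emails, normalize])
print(clean)
```

At 10M documents, each of these stand-ins becomes its own batch job, but the shape stays the same: an ordered list of functions, each with one responsibility.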

3/4

01.10.2025 12:20 👍 2 🔁 0 💬 1 📌 0

I wanted to outline the steps to give you an idea of what production applications typically deal with.

Here are the steps in their pipeline:

2/4

01.10.2025 12:20 👍 2 🔁 0 💬 1 📌 0

I'm working with a huge company on their data preparation, and they are running a 9-step data processing pipeline.

This all happens before their data is ready for training. There's a lot of code here and considerable complexity.

1/4

01.10.2025 12:20 👍 3 🔁 0 💬 1 📌 0

I'll try to stick around.

30.09.2025 17:36 👍 0 🔁 0 💬 0 📌 0

And no other programming language can match Python in code clarity and readability.

Except English, maybe.

30.09.2025 15:10 👍 1 🔁 0 💬 2 📌 1

On top of that, over 90% of the Python code you interact with uses C/Rust/Fast-thing behind the scenes anyway.

The speed at which you can write good Python code is way more important in most situations than the speed at which that code must run.

30.09.2025 15:10 👍 1 🔁 0 💬 1 📌 0

Python is the best language in the world.

Yes, Python is slower than other languages, but I don't care because most of my work doesn't require it to be faster.

Python is fast enough for what I need, and fast enough for most people out there.

30.09.2025 15:10 👍 2 🔁 0 💬 2 📌 0

Gemini's documentation says that "responses for a given prompt are mostly deterministic, but a small amount of variation is still possible."

I'm not sure about you, but "mostly deterministic" is not the same as "deterministic."

9/9

30.09.2025 13:04 👍 3 🔁 0 💬 0 📌 0

So, in practice, the LLM you are calling is non-deterministic, even at temperature = 0.

OpenAI's documentation tells us that we can expect "(mostly) deterministic outputs across API calls."

8/9

30.09.2025 13:04 👍 2 🔁 0 💬 1 📌 0

• Your model provider will patch, retrain, and fine-tune the model you are using without telling you. What was true yesterday isn't guaranteed today. (I'm kinda cheating with this last point, because it's not related to the nature of models, but it's still a problem we have to deal with.)

7/9

30.09.2025 13:04 👍 1 🔁 0 💬 2 📌 0

• If you are using a model scaled across multiple servers, you are not always hitting the same model instance, which increases the chances of running into differences in hardware.

6/9

30.09.2025 13:04 👍 1 🔁 0 💬 1 📌 0

• Floating-point operations are tricky, and even when using greedy decoding, a model might suffer from tiny numeric drifts that shift the outputs.
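A toy illustration of that drift in plain Python, not a reproduction of what happens inside a GPU. Floating-point addition is not associative, so the same three numbers summed in a different order can differ in the last bit. The two "logits" lists and the `greedy_pick` helper are hypothetical, chosen so that the last bit decides the winner.

```python
values = [0.1, 0.2, 0.3]

# Same numbers, different grouping, as can happen when a reduction is
# split across parallel workers in a different order.
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])


def greedy_pick(logits: list[float]) -> int:
    # Greedy decoding: take the index of the largest score (first one on ties).
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical two-token vocabulary where token 1's score is the sum above.
pick_a = greedy_pick([0.6, left_to_right])
pick_b = greedy_pick([0.6, right_to_left])

print(left_to_right == right_to_left)
print(pick_a, pick_b)
```

When two candidate tokens are nearly tied, a last-bit difference is enough to change which one greedy decoding selects, and every token after that can diverge.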

5/9

30.09.2025 13:04 👍 1 🔁 0 💬 1 📌 0

Theoretically, temperature=0 makes a model deterministic. In practice, this is not the case.

There are several reasons:

• Non-deterministic hardware ops can produce different outputs. Two runs of the same GPU kernel can diverge slightly. This may introduce slight differences in your outputs.

4/9

30.09.2025 13:04 👍 1 🔁 0 💬 1 📌 0

Yesterday, most replies tried to school me about the temperature parameter when using an LLM. Yes, you can set this parameter to 0 to try to force a model to return the same answer, but in practice, you'll find this isn't guaranteed.

3/9

30.09.2025 13:04 👍 2 🔁 0 💬 1 📌 1

A deterministic model will always return the same answer given the same input. My argument is that the set of large language models we use today can't fulfill that contract.
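That contract is easy to write down as a check. This is a rough sketch with hypothetical names (`is_deterministic`, `drifting_model`); in practice the callable would wrap a real model call. Passing the check proves nothing, since a model could still vary on the next call, but a single failure is enough to show the contract is broken.

```python
import itertools
from typing import Callable


def is_deterministic(model: Callable[[str], str], prompt: str, runs: int = 5) -> bool:
    """Return False as soon as repeated calls with the same input disagree."""
    outputs = {model(prompt) for _ in range(runs)}
    return len(outputs) == 1


# Hypothetical stand-in for an LLM whose answer drifts between calls.
_variants = itertools.cycle(["Paris", "Paris."])


def drifting_model(prompt: str) -> str:
    return next(_variants)


print(is_deterministic(str.upper, "hello"))       # a pure function passes
print(is_deterministic(drifting_model, "hello"))  # the drifting stand-in fails
```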

2/9

30.09.2025 13:04 👍 1 🔁 0 💬 1 📌 0

Yesterday, I posted that large language models are non-deterministic, and people went bonkers.

People called me stupid, asked me to go back to school, and used so many other names I can't remember.

Today, I want to elaborate.

1/9

30.09.2025 13:04 👍 1 🔁 0 💬 1 📌 0

Building software that works is very easy… as long as you ignore all the hard parts.

19.01.2025 14:01 👍 27 🔁 1 💬 1 📌 0

If someone presents themselves as a "Programmer" and somebody else as a "Software Engineer," would you consider them equally capable?

Do you see a fundamental difference between a Programmer and a Software Engineer?

15.01.2025 13:15 👍 11 🔁 0 💬 2 📌 0

The final 20% of the work always takes 80% of the total effort.

11.01.2025 18:05 👍 31 🔁 3 💬 5 📌 0