This is telling coming from a leading LLM researcher.
A tweet that reads "The new academic wealth gap isn't your university. It's not even your advisor's connections. It's who knows Claude can turn 50+ research papers into a thesis chapter in 3 hours and who's still manually coding qualitative data. I just watched a sociology PhD skip 8 weeks of analysis. Here are the 9 prompts they used:"
fuck this so much
we are so not prepared for the mountain of dogshit "scholarship" about to flood journals everywhere
2/ But I have one good argument left. Today, I had a parameter optimization problem where I couldn't observe the function and needed to use human raters who could only say which inputs should give higher values. If I did not know how RLHF worked I would not have "seen this problem before."
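For the curious, the RLHF connection is preference learning: you never see the objective, only pairwise "A should score higher than B" judgments, and you fit a latent score to them. A minimal sketch in Python, with made-up candidates and rater judgments (none of this is from the original thread): fit a Bradley-Terry model to the comparisons, then take the best-scoring parameter.

```python
# Minimal sketch of the RLHF-style trick: the objective is unobservable,
# but raters can say which of two inputs should score higher. Fit a
# Bradley-Terry model over those pairwise judgments to recover latent
# scores, then pick the best-scoring candidate. All values here are
# illustrative, not from the original post.
import numpy as np
from scipy.optimize import minimize

candidates = np.linspace(0.0, 1.0, 8)          # parameter values under test
# (winner_index, loser_index) pairs from human raters
comparisons = [(5, 1), (5, 2), (6, 5), (6, 7), (4, 0), (6, 3)]

def neg_log_likelihood(scores):
    # Bradley-Terry: P(i beats j) = sigmoid(scores[i] - scores[j]),
    # so each comparison contributes log(1 + exp(-(s_w - s_l))).
    nll = sum(np.log1p(np.exp(-(scores[w] - scores[l])))
              for w, l in comparisons)
    return nll + 1e-3 * np.sum(scores ** 2)    # tiny ridge for identifiability

fit = minimize(neg_log_likelihood, np.zeros(len(candidates)), method="BFGS")
print(f"best parameter estimate: {candidates[np.argmax(fit.x)]:.3f}")
```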
1/ I am constantly droning on about how AI ppl need to understand what their models are doing. I am starting to lose the battle when it comes to model selection - "I can just ask Claude and it will select the right model for me, so stfu about me needing to understand what it is doing."
Let me ask the question that is going to get me burned at the pyre: why do we need models to learn good representations anyway? Why does it matter?
My take on possible oil scenarios & conflict in the Middle East. Inspired by Jurassic Park and Jeff Goldblum's explanation of chaos theory. This one hit close to home, as I know tail risks too well. My mom cried - should have warned her first. Lots of humility amidst uncertainty.
I really hope that this does not turn out to be a Claude targeting mistake. We will likely never know, but this is precisely the reason that Dario did not want Anthropic tech used without humans in the loop.
Wow
This looks interesting. I have always hated vector databases because they force stock embeddings.
github.com/GoogleCloudP...
A little faded, but still there.
I thought about taking the flag down today out of shame. But then I worried about what my neighbors would think - not in the way one normally uses that phrase. I was afraid that they would think that I had given up. I am not going to give up on my neighbors.
www.tandfonline.com/doi/full/10....
Saw this in the dead-trees version, which is ancient by now, but the approach is cute. Claude Sonnet 3.5 smokes other same-epoch models on 0-shot slop classification. Error rates are not impressive, though.
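For context, 0-shot classification with an LLM just means asking the model directly, with no labeled examples. A hedged sketch using the Anthropic Python SDK; the paper's actual prompt and label set aren't in the post, so this prompt is an illustrative guess, not their protocol.

```python
# Sketch of 0-shot slop classification: prompt the model to label a
# passage with no training examples. The prompt wording and labels are
# assumptions for illustration, not the paper's setup.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def classify_slop(text: str) -> str:
    msg = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=5,
        messages=[{
            "role": "user",
            "content": "Answer with exactly one word, SLOP or HUMAN. "
                       f"Is this passage AI-generated slop?\n\n{text}",
        }],
    )
    return msg.content[0].text.strip()

print(classify_slop("In today's fast-paced world, leveraging synergies..."))
```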
otava.apache.org
Great to see leading performance engineering tools coming to the ASF
Rattler security
After clearing 2 rodent nests there, we were actually happy to find "security" under the AC unit yesterday. Loud rattle is a plus.
When I get bored with science, it's because I've read the same paper three times from different authors whose media diet consists of corporate PR releases and unhinged slop summaries for viral arxiv drops.
I don't think it is so much devaluing as over-simplifying. Claude can do amazing things with code because the language is so simple and good prompts can be translated into that low-dimensional space easily. The mistake is thinking that, as with (some) code geniuses, this must mean it can think.
It's hard to imagine any leadership job in any company where Trump's no-plan, grab-ass leadership style would not get you fired. This is so embarrassing.
Go work with Ted. You will learn something.
Integrating AI-driven risk detection with scheduling systems could enable construction projects to automatically adjust plans in response to emerging issues, potentially reducing delays and improving resilience.
I am tempted to try the problem :)
Beautiful day for a run
www.quantamagazine.org/the-man-who-...
This article really bugs me. Missed opportunity to explain the arguments. Smear piece distracting from what must have been a great collaboration. I expect more from Quanta.
gist.github.com/dollspace-ga...
We can actually do this kind of thing now. Let's harvest the AI surplus to improve software quality.
Any sufficiently large k-NN is indistinguishable from magic
🧙‍♂️
When a company in an industry built on hype tells you that a use case is a bad idea, and actually dangerous, that means it's a *catastrophically* bad idea.
While whiskey Pete plays army with Grok and shoots at speedboats, our defenses against today's actual imminent threats fall into disrepair.
Now THAT's a headline.
"The U.S. spent $30 billion to ditch textbooks for laptops and tablets: The result is the first generation less cognitively capable than their parents"
fortune.com/2026/02/21/l...
Did ChatGPT write this talking point for you, Sam?
Or do you just *organically* suck this bad?