There must always be a boogeyman.
This has to have been the most momentous GenAI month since 11/22.
The best SOTA frontier models drop, and they're amazing.
4 massive open models drop and they're closer than ever to the lead.
OpenClaw blows up and creates a zeitgeist.
Man, what a time to be alive.
It sucks. Saw your blog months before the craze and had been recommending it; happy to see someone with the pure joy and intrinsic motivation capturing the lightning. Some hate is a byproduct of reach. In my teams (selected for love of the work), you have nothing but fans.
I read the thread and a lot of the responses, and I have to say there's one really interesting story twist that might've been fitting: one where Jackie doesn't die, and he's around to comment on what you're doing, but he's out of the game. A foreshadowing of Phantom Liberty, and the contrast it would have painted for you.
Interesting. I think OpenAI may do something similar. Last night I was watching Codex try to run a `ps` and it was failing. Ultimately, I'm skeptical of these shim approaches; you want them to have tons of horsepower. Running the tool in the sandbox seems better.
Love this detail. Reuters on the ruling: www.reuters.com/legal/litiga...
So not surprised. When the "Is ChatGPT's behavior changing over time?" paper came out I published a critique, partially because it's not that the paper was technically wrong, but that it was written with ambiguous conclusions that the press misconstrued. The world loves to see AI fail right now. 🤷🏼‍♀️
This conversation with ChatGPT goes to show: garbage in, garbage out chatgpt.com/share/6841e2...
It knows a lot, but it won't second-guess you unless you ask/demand it.
as an AI founder who is ridiculously all in, when I'm like flabbergasted by the temerity of your product I just can't even…
This fits really well with the "putting our national security conversation on Telegram is a non-story" line
How much faster would the science of large-scale AI advance if we could open-source the *process* of building a frontier model?
Not just the final models/code/data, but also negative results, toy experiments, and even spontaneous discussions.
That's what we're trying @ marin.community
Suno is so fucking good. Wow.
The year is 2030. NVIDIA announces a new DGX with an attached fusion reactor.
Claude did not understand the mission and tried to write his system prompt to Notion. haha!
This was using 5 speculative tokens (the example in the docs, but for llama.cpp I've found 5-8 to be the sweet spot).
Home setup: RTX 3090 + 4090. Testing AWQ Qwen3: both 32B alone and 32B with a 4B draft model for speculative decoding.
TL;DR: ⚠️ on speculative decoding in vanilla vLLM. Even with 75% token acceptance, batch-4 generation went from 150 t/s (no speculative) to 50 t/s (with speculative).
I'd get 400+ t/s max throughput without speculative decoding at a larger batch size.
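A back-of-the-envelope check on those numbers: under the standard speculative-sampling estimate (the usual formula from the speculative decoding literature, not something stated in these posts), drafting `gamma` tokens with per-token acceptance rate `alpha` yields `(1 - alpha^(gamma+1)) / (1 - alpha)` tokens per target-model pass in expectation. A minimal sketch, plugging in the 75% acceptance mentioned above:

```python
def expected_tokens_per_pass(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model forward pass when
    drafting `gamma` tokens with per-token acceptance rate `alpha`
    (standard speculative-sampling estimate)."""
    if alpha >= 1.0:
        return gamma + 1  # every draft accepted, plus the bonus token
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)

# With 75% acceptance and a draft length of 5, each verification
# pass yields ~3.3 tokens on average:
print(round(expected_tokens_per_pass(0.75, 5), 2))  # -> 3.29
```

So a ~3x per-pass gain should be available in principle; seeing throughput *drop* instead suggests the draft-model overhead and its interaction with batching, not the acceptance rate, is the bottleneck.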
I thought we'd agreed on paperclips
Yes
Sorry, what? This seems very specific, and I've been wondering how to make sense of the differing reports on quality; the LMSYS model is literally labeled Maverick, though. Are you saying there was a different, unreleased version of Maverick?
Wild and wonderful watching a keynote demo 100ft wide and being able to literally picture the code in your head.
I haven't used it enough to say, but it's interesting to think that going from 70% success to 85% success is "twice as good": the failure rate is halved, from 30% to 15%. Asymptotic improvements are still significant.
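The arithmetic behind "twice as good": doubling quality here means halving the failure rate, not doubling the success rate. A tiny illustration:

```python
def failure_reduction(success_before: float, success_after: float) -> float:
    """How many times smaller the failure rate got."""
    return (1 - success_before) / (1 - success_after)

# 70% -> 85% success means failures went from 30% to 15%: cut in half.
print(round(failure_reduction(0.70, 0.85), 2))  # -> 2.0
```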
Every schema for content should have a human-vs-AI flag (or perhaps an enum with human, AI, and then some hybrid roles).
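A minimal sketch of what that enum could look like; the specific names and hybrid roles here are hypothetical, chosen just to illustrate the idea:

```python
from enum import Enum

class ContentOrigin(str, Enum):
    """Hypothetical authorship flag for a content schema."""
    HUMAN = "human"
    AI = "ai"
    AI_DRAFT_HUMAN_EDITED = "ai_draft_human_edited"      # hybrid: AI wrote, human revised
    HUMAN_DRAFT_AI_ASSISTED = "human_draft_ai_assisted"  # hybrid: human wrote, AI polished

# Subclassing str keeps JSON/database serialization trivial:
print(ContentOrigin.AI.value)  # -> ai
```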
Well, you're not going to make a lot of friends with the lawyers, but the rest of us are pretty happy.
Jensen: "I'd never buy a hopper!"
Azure: "We don't have any Ampere GPUs to turn up even."
Me: 😅
640KB ought to be enough for anybody.
SemiAnalysis, IIRC, thinks DeepSeek actually had $1.3B in hardware, so not a box of scraps :) But I agree there's no need for, or long-term gain from, an artificial moat. A surgical ban on the DeepSeek API (no government use; contractors/employees required to disclose use) would be fine, IMO.
💯 To be fair, China has a deserved (I think) reputation for engineering economic success with state policies. But in this case the answer is to optimize more domestically. AI is not some consumable like a car; it is creating a flywheel. We can't afford to be crippled.
Even banning API access here is super sketchy, and if this is meant to say we should ban the weights... it's not just a horrible take, it's anti-competitive lobbying. Not quite as dumb as Hawley's bill - which arguably banned the US from using China's published algorithms/math - but terrible still.
Side note - when I was chiming in on GitHub and, I think, actually triggered gg to start merging this back in, I was running 32B Q8 with a 7B 4KL draft, but I think I still had mine set to --draft 5. I will say the 7B draft >>>> 3B so far. I may have to play around with even smaller drafts.