LLMs become much less useful the moment you dismantle bureaucracy.
The modern internet penalises anyone who thinks for more than 3 seconds before forming a strong opinion. You'll be happy to know that I only thought for 2 seconds before typing this.
The major barrier here is that the massive budgets get spent on AI researcher/engineer compensation and GPUs, leaving very little to pay for the best domain experts. I think this is a massive miscalculation. The first lab to realise this will quickly become the market leader.
It should therefore come as no surprise that new models like Grok 3 and GPT-4.5 feel like small incremental improvements. The focus now should be on improving post-training data. Frankly, the quality kinda sucks even at the best labs.
The idea that scaling up LLMs on human data will produce superhuman performance is magical thinking. The highest attainable performance in any given domain is simply the best human performance. Yes, maybe some insights transfer across domains, but I don't think there's actually evidence for that.
The people who laugh about hacky "vibe coding" with LLMs are the same people who think that Grok 3 is better than a doctor. Absolutely nuts.
"Below is the updated code" followed by absolutely no code is such an o3-mini thing to do that at this point I don't really understand why this model exists.
After much time spent looking at reasoning traces from DeepSeek R1 for medical cases, I have to conclude that there isn't a strong correlation between good reasoning and a good answer.
The really interesting thing is that they're not all made equal. The Llama-8B distill can solve medical cases that the Qwen-based R1 distills (and even R1 itself) cannot. World knowledge still matters for solving real problems, and no open models beat Llama in that regard.
Say the benchmark has 100 questions: generate 64 responses per question, and pass@1 is then the total number correct / 6400.
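That estimate can be sketched in a few lines of Python; the function name and inputs here are illustrative, not from any particular eval harness:

```python
def pass_at_1(correct_counts, k):
    """Estimate pass@1 by averaging over k samples per question.

    correct_counts: number of correct responses (out of k) for each question.
    """
    total_correct = sum(correct_counts)
    total_samples = len(correct_counts) * k
    return total_correct / total_samples

# 100 questions, 64 responses each; suppose every question
# gets 32 of its 64 samples right:
print(pass_at_1([32] * 100, 64))  # 3200 / 6400 = 0.5
```

Averaging over many samples per question just gives a lower-variance estimate of the same single-attempt accuracy.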
These R1 distilled models are absolutely amazing on a single turn, but truly horrible on multi-turn conversations.
"Mildly elevated rheumatoid factor has a very low positive predictive value that is completely overwhelmed in magnitude by the negative predictive value of not having any signs or symptoms of rheumatoid arthritis."
My 2 year old has 3 adjectives for the size of things. In increasing order: small, mummy, big.
5.7B tokens to solve 100 tasks??? I don't understand why we're thinking of this as being incredibly smart, when what this suggests is that it's incredibly dumb.
Releasing Jupyter Agents - LLMs running data analysis directly in a notebook!
The agent can load data, execute code, plot results, and follow your guidance and ideas!
A very natural way to collaborate with an LLM over data, and it's just scratching the surface of what will soon be possible!
Did you use oil?
What percentage of "rhupus" is just misdiagnosed Sjögren?
"ILD, hyperglobulinemia & lab abnormalities" sounds like SjD to me!
I just follow everyone and then spend all my time on the quiet posters feed (except when I want to come judge the yappers)
o1 is equal parts brilliant and boring. Very, very boring.
Llamafile is a cheat code.
"Zuckerberg's eyes brimmed with tears, and his heart felt full. He truly loved Big Brother!"
You better believe it. "California boomer" is the vibe I am getting from this guy.
Prop stethoscope checks out, though.
Sora's idea of a hand exam. This 60-year-old rheumatologist is giving me very 2nd-year medical student vibes with this bizarre technique. No synovitis was detected this day. #rheumsky
Oh, and if you follow him then you'll end up on a list, which means you'll see fewer of these people. I genuinely have no interest in seeing his posts (I find him uniquely annoying), but it's probably worth it.
Ultimately, if performance is anything like previous iterations of Phi, it will greatly underwhelm outside of benchmarks. So the license has no meaning to me.
So much delving