When you need to tune for your domain, the parameters give you meaningful handles to turn. The interpretability is genuinely valuable."
arpitbhayani.me/blogs/bm25
BM25 by Arpit Bhayani
"What makes BM25 worth understanding is not just that it works. It is that it works for knowable reasons. Every part of the formula has a clear interpretation. When a result is surprising, you can trace why.
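The interpretability claim is easy to see in code. A minimal sketch of the standard Okapi BM25 term score with the common k1/b defaults (variable names are mine, not from the post):

```python
import math

def bm25_score(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
    """Score one query term against one document.

    tf: term frequency in the doc, df: number of docs containing the term,
    N: total docs in the corpus, dl: doc length, avgdl: average doc length.
    """
    # IDF: rarer terms contribute more to the score.
    idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
    # Saturating TF: repeated occurrences give diminishing returns (k1),
    # with a length-normalization penalty for long documents (b).
    tf_part = tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
    return idf * tf_part
```

A document's score for a query is just the sum of this over the query terms, which is why a surprising ranking can be traced term by term: each term's contribution is one IDF factor times one saturation factor.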
A word of wisdom to live by - do not let your luxury possession possess you.
So true.
My thoughts on gpt-5.4 high on Codex CLI
I have no idea if it is better than gpt-5.3-codex or even gpt-5.2, but it devours tokens like a competitive eater at a Las Vegas buffet.
Intel Panther Lake Die Shot
Why does it look like an Impressionist painting? BSPDN (backside power delivery).
FYI
Speculative Speculative Decoding (SSD)
It's up to 2x faster than the strongest inference engines in the world, but you need H100 or better GPUs.
Paper: arxiv.org/abs/2603.03251
Repo: github.com/tanishqkumar...
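For context, the baseline trick this presumably builds on works like this: a cheap draft model proposes several tokens, and the expensive target model verifies them in one batched pass. A toy greedy sketch (not the paper's method; `draft` and `target` are stand-in functions, not real models):

```python
def speculative_step(draft, target, prefix, k=4):
    """One round of (greedy) speculative decoding.

    draft/target: functions mapping a token prefix to the next token.
    The draft proposes k tokens; we keep the longest run the target
    agrees with, plus one corrected (or bonus) target token, so each
    round emits at least one token the target model would have produced.
    """
    # Draft phase: k cheap sequential calls.
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        t = draft(ctx)
        proposal.append(t)
        ctx.append(t)
    # Verify phase: in a real engine these run as one batched forward pass.
    accepted, ctx = [], list(prefix)
    for t in proposal:
        want = target(ctx)
        if want == t:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(want)  # replace first mismatch with target's token
            break
    else:
        accepted.append(target(ctx))  # all accepted: free bonus token
    return accepted
```

The speedup comes from the verify phase being parallel while plain decoding is serial; how the paper stacks a second layer of speculation on top of this, I haven't checked.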
PyTorch's FlexAttention also supports FlashAttention-4 backend.
PyTorch now auto-generates CuTeDSL score/mask modifications and JIT-instantiates FlashAttention-4 for your custom attention variant
The result: 1.2× to 3.2× speedups over Triton on compute-bound workloads.
pytorch.org/blog/flexatt...
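Conceptually, a score_mod is just a function applied to every (q_idx, kv_idx) attention score before softmax, and PyTorch compiles yours into the kernel. A dependency-free toy model of that contract (this is my illustration of the semantics, not the actual `torch.nn.attention.flex_attention` API):

```python
import math

def attention_with_score_mod(scores, score_mod):
    """Toy model of the FlexAttention contract: score_mod is applied
    elementwise to each (q_idx, kv_idx) raw score, then softmax.

    scores: 2D list [q_idx][kv_idx] of raw q.k products.
    Returns the post-softmax attention weights.
    """
    out = []
    for q_idx, row in enumerate(scores):
        modded = [score_mod(s, q_idx, kv_idx) for kv_idx, s in enumerate(row)]
        m = max(modded)                       # stabilize softmax
        exps = [math.exp(s - m) for s in modded]
        z = sum(exps)
        out.append([e / z for e in exps])
    return out

# Example: a causal mask expressed as a score_mod.
def causal(score, q_idx, kv_idx):
    return score if kv_idx <= q_idx else float("-inf")
```

The point of the blog post is that you write only the small modifier function and still get fused-kernel (now FlashAttention-4) speed.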
- Paper: github.com/Dao-AILab/fl...
- Code: github.com/Dao-AILab/fl...
- Blogposts:
together.ai/blog/flashat...
tridao.me/blog/2026/fl...
research.colfax-intl.com/flashattenti...
FlashAttention-4
I hope it is not a pain to work with. It changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attention reaches ~1600 TFLOPs, pretty much at matmul speed!
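The reason softmax can stop dictating speed at all is the online-softmax rescaling trick FlashAttention kernels are built on: keep a running max and normalizer so the full score row is never materialized. A scalar toy version (real kernels do this blockwise over tiles, with vector accumulators):

```python
import math

def online_softmax_weighted_sum(scores, values):
    """One-pass softmax-weighted sum over streamed (score, value) pairs.

    m is the running max, z the running normalizer; both the normalizer
    and the accumulator are rescaled whenever a new max appears, so the
    result equals softmax(scores) . values without a second pass.
    """
    m, z, acc = float("-inf"), 0.0, 0.0
    for s, v in zip(scores, values):
        m_new = max(m, s)
        scale = math.exp(m - m_new) if m != float("-inf") else 0.0
        z = z * scale + math.exp(s - m_new)
        acc = acc * scale + math.exp(s - m_new) * v
        m = m_new
    return acc / z
```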
You can always go to other platforms and browse through 100s or 1,000s of postings and view posts by the original authors.
OpenAI's Symphony
A Linear Board for agents.
github.com/openai/symph...
Teaching LLMs to reason like Bayesians
By training models to mimic optimal probabilistic inference, the researchers improved the models' ability to update their predictions and generalize across new domains.
research.google/blog/teachin...
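For reference, the "optimal probabilistic inference" target here is just Bayes' rule. A minimal discrete example (toy numbers of my own, not from the paper): given a prior over hypotheses and the likelihood of the observed evidence under each, the posterior is the normalized product.

```python
def bayes_update(prior, likelihood):
    """Bayes' rule over discrete hypotheses.

    prior: dict hypothesis -> P(h)
    likelihood: dict hypothesis -> P(evidence | h)
    Returns the posterior P(h | evidence).
    """
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())  # P(evidence), the normalizing constant
    return {h: p / z for h, p in unnorm.items()}
```

E.g. a 50/50 prior on a fair vs heads-biased coin, after one observed head, shifts toward "biased" exactly in proportion to the likelihood ratio; that calibrated shifting is what the models are being trained to imitate.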
No. You can safely assume that I haven't tried most of the things I post about.
Question: Is ColBERT worth it? I am seeing like ~10X increase in latency and ~30X increase in storage, as compared to dense/sparse vectors.
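For anyone wondering where those multipliers come from: ColBERT stores one embedding per token and scores with MaxSim over all query-document token pairs, versus one vector and one dot product for a dense embedding. A minimal sketch of the scoring (toy unnormalized vectors, plain lists for clarity):

```python
def maxsim_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction: for each query token embedding,
    take the max dot product over all doc token embeddings, then sum.

    Cost is O(|q| * |d|) dot products per candidate document, and storage
    is one vector per token instead of one per document -- the source of
    the latency and storage blow-up relative to single-vector retrieval.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)
```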
I liked it when mkbhd said "If you are watching my video on MacBook Neo then a new MacBook Neo is not a computer for you".
What is gog?
- Coach and redirect agents quickly and effectively as they surface for help
- Conduct thorough code reviews of completed work
- Intervene manually when needed to ship well-tested features"
www.tolans.com/relay/why-th...
Why The Best AI Engineers Are Former Managers by Quinten Farmer
"We evaluate Agent Engineering Managers on their ability to:
- Break down ambiguous product problems into well-scoped tasks
- Delegate those tasks with appropriate milestones and planned checkpoints
Source: x.com/hacker_/stat...
Claude "the hacker"
Sure. For the record, I've been consistent in my views and very transparent about them.
He's an opportunistic politician, but it's still notable to hear a mainstream politician break from AIPAC and call Israel an "apartheid state."
www.politico.com/news/2026/03...
Interesting development… I guess Alexandr Wang is on the way out. That was a bit quicker than I expected. I would have thought he'd be given at least a year of runway.
timesofindia.indiatimes.com/technology/t...
I would defer to the community.
Full weights (16bit/4bit), code, technical report & training details: all free for the community.
github.com/Yuan-lab-LLM...
Inspur's Yuan Lab released Yuan 3.0 Ultra - their flagship multimodal MoE foundation model, built for stronger intelligence and unrivaled efficiency.
- Efficiency Redefined: 1010B total / 68.8B activated params
- Smarter, Not Longer Thinking
- Enterprise-Grade Agent Engine
FYI: I'm not affiliated with either Meta or FFmpeg.