I'm sadly not at #ACL2025, but the work on tokenization seems to continue to explode. Here are the tokenization-related papers I could find, in no particular order. Let me know if I missed any.
LiteLLM does an impressive job tracking per-token prices for a wide variety of LLMs, but their documentation is a bit thin on how to use that info. Here's a short example of how I use a CustomLogger class to track costs across multiple LLM calls.
www.crosstab.io/articles/lit...
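For reference, the pattern looks roughly like this. It's a minimal sketch: `CustomLogger` below is a stand-in for `litellm.integrations.custom_logger.CustomLogger` so the snippet runs without litellm installed, and I'm assuming the success callback receives the computed cost in `kwargs["response_cost"]`, which is how the real library reports it.

```python
# Minimal sketch of cost tracking with a LiteLLM-style custom logger.
# CustomLogger here is a stand-in for litellm's base class; the real
# library calls log_success_event() after each completion and passes
# the computed cost in kwargs["response_cost"].

class CustomLogger:  # stand-in for litellm.integrations.custom_logger.CustomLogger
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        pass

class CostTracker(CustomLogger):
    """Accumulate spend and call counts across multiple LLM calls."""

    def __init__(self):
        self.total_cost = 0.0
        self.calls = 0

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        # response_cost can be None for models LiteLLM has no price for.
        self.total_cost += kwargs.get("response_cost") or 0.0
        self.calls += 1

tracker = CostTracker()
# With the real library you'd register it once:
#   litellm.callbacks = [tracker]
# and then every litellm.completion() call feeds it automatically.

# Simulate two completed calls to show the accounting.
tracker.log_success_event({"response_cost": 0.0021}, None, 0, 1)
tracker.log_success_event({"response_cost": 0.0034}, None, 1, 2)
print(f"{tracker.calls} calls, ${tracker.total_cost:.4f} total")
```

The nice part of this approach is that the accounting lives in one place no matter how many models or call sites you have.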
Excellent, definitely a π. The argument for social science measurement is even stronger, in light of Farrell et al.'s March paper that we should view large models "as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated."
Anyone got a good alternative to Pocket as a read later / stash a copy of an article tool?
I promise I'll shut up about AI soon, but since so many asked, I wrote down my agentic flow and also why I'm all of a sudden writing Go. lucumr.pocoo.org/2025/6/12/ag...
Claude 3.7 Sonnet followed my text-to-SQL instructions flawlessly, but Claude Sonnet 4 just can't seem to get it right.
www.crosstab.io/articles/cla...
The flip side of this is that if you *are* behind it's never been easier to jump in and start swimming.
Unfortunately, the housing theory of everything is correct, and you can't unsee it once you see it:
worksinprogress.co/issue/the-ho...
It's like it assumes it's running in fully autonomous mode in my IDE
Is it just me or does Claude 4 Sonnet seem super overeager with code in the chat UI?
I just want to know how some API's output is structured and Claude is giving me hundreds of lines of fuzzy deduplication, error trapping, the whole works.
Randomly came across this reddit post about a new document processing leaderboard.
So far, structured data extraction from documents is the killer app for VLMs but public benchmarks and leaderboards have been non-existent. Excited to see that changing.
www.reddit.com/r/MachineLea...
DSPy has a lot going for it but obfuscating how prompts are constructed creates problems. Beware the footguns!
www.crosstab.io/articles/dsp...
The now-viral, incorrect meme that LLMs are "just next-token predictors" is causing so much confusion.
Plus prompt caching to avoid re-sending the whole table schema to the LLM on every call, and sqlglot to validate the LLM's output.
I extended @ramikrispin.bsky.social's excellent work to use Claude Sonnet 3.7 to translate natural language data queries into runnable SQL.
Along the way, I showed that Claude can do this even with English questions against a non-English dataset.
www.crosstab.io/articles/llm...
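The validation step is worth sketching, since it's cheap insurance against the LLM hallucinating syntax or table names. This sketch swaps in the stdlib's sqlite3 (via `EXPLAIN`, which parses and plans a query without running it) in place of the sqlglot check mentioned above, so it's self-contained; the `sales` table is a hypothetical stand-in for your real schema.

```python
# Validate LLM-generated SQL before running it, using sqlite3's EXPLAIN.
# EXPLAIN compiles the query against the schema without executing it, so
# it catches both syntax errors and references to nonexistent tables or
# columns. The `sales` table here is a hypothetical example schema.

import sqlite3

def is_valid_sql(query: str) -> bool:
    """Return True if `query` parses and resolves against the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount REAL)")
    try:
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(is_valid_sql("SELECT region, SUM(amount) FROM sales GROUP BY region"))
print(is_valid_sql("SELEC * FORM sales"))  # typo'd keywords fail validation
```

A parser-based check like sqlglot's has the advantage of working for dialects sqlite doesn't speak (e.g. BigQuery or Snowflake SQL); the sqlite approach adds schema resolution for free when your target happens to be sqlite-compatible.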
Yes! I'm looking forward to the Tal & Claude Sonnet 3.7 renaming of ice cream shops tour. I mean, "Big Spoon", really? What a waste!
I think we should all chime in and vote on the names of these joints. They're like little bites of ice cream for the mind that we all get to enjoy from afar.
My favorite so far: John's Water Ice
Runner up: Owowcow
The "Paper Skygest" is a total validation of the bluesky thesis. Anyone can build a useful, tunable feed. It's a bit sparse right now but it'll be amazing once it takes off fully.
What exactly passes for a foundation model these days?
The brilliant Cosma Shalizi writing about LLMs is always worth reading:
www.programmablemutter.com/p/on-feral-l...
if you're a PhD student or postdoc working at the interface of personality psychology and CS/ML (construed broadly on both sides), and are interested in doing a full-time, remote, 3 - 6 month internship/residency at MidJourney, please DM me some kind of resume or CV-like thing
Highly recommended
A video of Pre-Training GPT-4.5 by OpenAI (46 minutes)
www.youtube.com/watch?v=6nJZ...
It turns out to be hard to evaluate natural language with natural language. What should we take away from the conundrum of LLM evaluation? www.argmin.net/p/evaluation...
The idea that poetics is more central to language than semantics or syntax jumped out to me.
Maybe we need to build a taste-based vocabulary for LLM benchmarks. We have all sorts of terms to describe how art, music, food, etc. make us *feel*, but with LLMs we're stuck with "vibes".
I've been happy with Neon so far.
I don't get it, why does Meta prohibit people in the EU from using Llama 4 models?
www.llama.com/llama4/use-p...
Kicking the blog back into gear...
www.crosstab.io/articles/202...
Mirror, mirror, on the wall...
www.totsantcugat.cat/actualitat/s...
Meta just dropped Llama 4 on a weekend! Two new open-weight models (Scout and Maverick) and a preview of a model called Behemoth. Scout has a 10-million-token context window.
Best information right now appears to be this blog post: https://ai.meta.com/blog/llama-4-multimodal-intelligence/