Common Corpus just broke 1M downloads: it took some time, but open data in AI is actually popular.
A little offended Grammarly didn't make a sloppelganger of me
"autoresearch" micro teaching repo from Karpathy
readme edits seem like such a nice dx for open ended hparam tuning, and maybe other kinds of hill climbing too, so much less painful than the old days
github.com/karpathy/aut...
It was all about spying on Americans: www.theatlantic.com/technology/2...
FlashSampling: Fast and Memory-Efficient Exact Sampling
Paper: flashsampling.github.io/FlashSamplin...
We analyzed 250K+ queries & 430K+ clickstream interactions from Asta, our AI-powered research assistant, and today we're releasing the full dataset. How do researchers actually use AI science tools? Here's what we found. 🧵
new blog post on permissioned data in atproto! this one introduces "buckets", the protocol-level primitive for shared access control. I walk through two approaches that don't quite work and land on something that I think does
let me know your thoughts!
tldr iiuc we are once again enclosing the commons and industrializing craft, dispossessing laborers while apotheosizing capital, and to slow down this doomloop we need to innovate new collectives and public goods
This has a very cool result on in-context learned classification tasks, where they disentangle representational quality (how well-separated concept labels are) and readout alignment (how good it is at reading out its own inner labels). Adding demo examples helps through readout, not representations!
Designing around the tight bottleneck on latency and throughput that separates local and cloud compute is such an interesting problem. Significant challenges though
Anti-homeless benches in Pokemon Legends ZA
why is there anti-homeless architecture in pokemon
A year ago, data center developers were focused on connecting to the grid. Today roughly 1/3 of all planned capacity is onsite power - and 72% of that planned capacity is fossil gas. Homer City PA's data center project could soon be one of the largest single sources of carbon emissions in the US.
Announcing our latest paper: CommonLID
In collaboration with @commoncrawl.bsky.social @mlcommons.org @jhu.edu we built a LID benchmark on actual Common Crawl text covering 109 languages. Existing evaluations overestimate how well LangID works on web data.
arxiv.org/abs/2601.18026
warning: earnestpost
thanks Caleb
extremely poor safekeeping of a student's private data
as a tech worker I think it's very disturbing to see Google endangering its own users
working on a seven thousand layer model of extended claugenition
New work by my former PhD student, Boyang Li
His team produced 500 stories of less than 100 words. LLMs were basically chance-level at answering binary questions about the stories
arxiv.org/abs/2601.12410
This is a real banger of a paper. The example of a model being weirdly focused on jasmine (lol) makes me increasingly think that single-point-of-access models don't really consider who their audience is. Jasmine is a super legible cultural marker for people outside, but is so, _so_ generic.
This was a colossal multi-year effort driven by an incredible team that gave this everything: Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, Zico Kolter. Much more in the paper! arxiv.org/abs/2601.03220 7/7
the reason I'd follow Cat Hicks into hell is this unswerving humanist conviction that actually
people are going to do the best they can
we can help them do even better
and neither avenue is served by thinking less of people
i think we are about to experience an explosion of the possibilities in reverse engineering
we're at a fascinating moment where I am still ~better at programming than Claude at a medium-horizon difficulty task, but Claude has me absolutely beat in terms of cognitive fatigue, so we're able to ship so much more stuff I never would've gotten around to before
Great list of models in 2025
I uh, made this. It was supposed to be a joke / concept-art thing that scrolls through the torrent of new AI/ML arXiv uploads too fast to read. But I think I iterated too much and made it almost usable.
Everyone's favorite feed is running on one person's gaming system. I love how hackable this site is; it makes it much more fun.
If you're working on a non-fiction research/writing project that isn't journalism and you don't have an academic affiliation, how do you find other people who are doing the same thing? Ideally locally (I'm in NY).
local first vs atproto!! what should the source of truth for group data be?
I am late to the game but I finally read the NeurIPS 2025 best paper on gating in LLMs, it is great.
Qiu et al.
Alibaba, U Edinburgh, Stanford, MIT, Tsinghua U
arxiv.org/abs/2505.06708
1/3