
brendan chambers

@societyoftrees

Ithaca | prev Chicago | interested in interconnected systems and humans+computers | currently: gardening

641
Followers
397
Following
166
Posts
18.10.2023
Joined

Latest posts by brendan chambers @societyoftrees

Post image

Common Corpus just breaking 1M downloads: it took some time but open data in ai is actually popular.

11.03.2026 19:57 πŸ‘ 58 πŸ” 4 πŸ’¬ 3 πŸ“Œ 1

A little offended Grammarly didn't make a sloppelganger of me

10.03.2026 20:55 πŸ‘ 1531 πŸ” 191 πŸ’¬ 30 πŸ“Œ 92
GitHub - karpathy/autoresearch: AI agents running research on single-GPU nanochat training automatically

β€œautoresearch” micro teaching repo from Karpathy

readme edits seem like such a nice dx for open-ended hparam tuning, and maybe other kinds of hill climbing too, so much less painful than the old days

github.com/karpathy/aut...

10.03.2026 17:26 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
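The workflow the post gestures at can be sketched as a toy loop: each trial appends its result to a markdown log, and the next candidate is perturbed from the best entry so far. This is a minimal illustration, not the repo's actual method — `train()`, the table layout, and the perturbation rule are all hypothetical stand-ins.

```python
# Hypothetical sketch of README-driven hparam hill climbing.
# train() is a stand-in for a real run; here loss is minimized at lr=3e-4.
import random

def train(lr):
    return (lr - 3e-4) ** 2

def log_trial(lines, lr, loss):
    # append one markdown table row per trial, as a readme-style log
    lines.append(f"| {lr:.2e} | {loss:.3e} |")

lines = ["| lr | loss |", "|----|------|"]
best_lr, best_loss = 1e-3, train(1e-3)
log_trial(lines, best_lr, best_loss)

for _ in range(20):
    lr = best_lr * random.uniform(0.5, 2.0)  # perturb the incumbent
    loss = train(lr)
    log_trial(lines, lr, loss)
    if loss < best_loss:  # hill-climb: keep only improvements
        best_lr, best_loss = lr, loss

print("\n".join(lines))
print(f"best lr ~ {best_lr:.2e}")
```

The appeal of the readme-as-state idea is that the whole search history stays human-readable between agent steps.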
Post image

It was all about spying on Americans: www.theatlantic.com/technology/2...

02.03.2026 01:33 πŸ‘ 49 πŸ” 10 πŸ’¬ 2 πŸ“Œ 0
Post image

FlashSampling: Fast and Memory-Efficient Exact Sampling

Paper: flashsampling.github.io/FlashSamplin...

01.03.2026 07:27 πŸ‘ 15 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0
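The paper itself isn't summarized in the post; as background on what "exact sampling" means here, the standard Gumbel-max trick draws an exact sample from a categorical distribution given unnormalized logits, with no explicit normalization. This is a generic baseline identity, not FlashSampling's algorithm.

```python
# Gumbel-max trick: argmax(logits + Gumbel noise) is an exact sample
# from softmax(logits). Generic background, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_sample(logits, rng):
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
    return int(np.argmax(logits + g))

logits = np.array([1.0, 2.0, 3.0])
counts = np.bincount(
    [gumbel_max_sample(logits, rng) for _ in range(100_000)], minlength=3
)
freqs = counts / counts.sum()
probs = np.exp(logits) / np.exp(logits).sum()  # softmax, for comparison
print(freqs, probs)  # empirical frequencies track softmax probabilities
```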
Post image

We analyzed 250K+ queries & 430K+ clickstream interactions from Asta, our AI-powered research assistantβ€”and today we're releasing the full dataset. How do researchers actually use AI science tools? Here's what we found. 🧡

27.02.2026 17:56 πŸ‘ 23 πŸ” 6 πŸ’¬ 1 πŸ“Œ 1
Post image
27.02.2026 01:53 πŸ‘ 238 πŸ” 40 πŸ’¬ 3 πŸ“Œ 2
Permissioned Data Diary 2: Buckets
The second in a series of posts building up a solution to permissioned data on atproto. We introduce buckets: a new protocol primitive for creating a shared social context.

new blog post on permissioned data in atproto! this one introduces "buckets", the protocol-level primitive for shared access control. I walk through two approaches that don't quite work and land on something that I think does

let me know your thoughts!

26.02.2026 18:12 πŸ‘ 286 πŸ” 57 πŸ’¬ 19 πŸ“Œ 21

tldr iiuc we are once again enclosing the commons and industrializing craft, dispossessing laborers while apotheosizing capital, and to slow down this doomloop we need to innovate new collectives and public goods

24.02.2026 18:33 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models
Decoder-only language models have the ability to dynamically switch between various computational tasks based on input prompts. Despite many successful applications of prompting, there is very limited...

This has a very cool result on in-context learned classification tasks, where they disentangle representational quality (how well-separated the concept labels are) and readout alignment (how well the model reads out its own internal labels). Adding demo examples helps through readout, not representations!

23.02.2026 20:01 πŸ‘ 36 πŸ” 5 πŸ’¬ 1 πŸ“Œ 0
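A toy sketch of the two quantities named above — not the paper's method, just an illustration that they can vary independently: the same well-separated clusters (fixed representational quality) yield very different accuracy depending on how well a fixed linear readout aligns with the separating direction. All names here are illustrative.

```python
# Two fixed Gaussian clusters; only the readout direction changes.
import numpy as np

rng = np.random.default_rng(0)
n = 500
pos = rng.normal([+2.0, 0.0], 0.5, size=(n, 2))  # class 1 along +x
neg = rng.normal([-2.0, 0.0], 0.5, size=(n, 2))  # class 0 along -x
X = np.vstack([pos, neg])
y = np.array([1] * n + [0] * n)

def separation(X, y):
    # representational quality: distance between class means,
    # in units of within-class spread
    mu1, mu0 = X[y == 1].mean(0), X[y == 0].mean(0)
    return np.linalg.norm(mu1 - mu0) / X[y == 1].std()

def readout_acc(X, y, w):
    # readout alignment proxy: accuracy of a fixed linear readout
    pred = (X @ w > 0).astype(int)
    return (pred == y).mean()

aligned = np.array([1.0, 0.0])     # points along the separating axis
misaligned = np.array([0.1, 1.0])  # mostly orthogonal to it

print("separation:", separation(X, y))  # same in both cases
print("aligned acc:", readout_acc(X, y, aligned))
print("misaligned acc:", readout_acc(X, y, misaligned))
```

The separation score is identical in both cases; only the readout changes the accuracy — the same decoupling the paper's demo-example result turns on.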

Designing around the tight bottleneck on latency and throughput that separates local and cloud compute is such an interesting problem. Significant challenges though

20.02.2026 16:52 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Anti-homeless benches in Pokemon Legends ZA

why is there anti-homeless architecture in pokemon

15.02.2026 00:51 πŸ‘ 2476 πŸ” 341 πŸ’¬ 58 πŸ“Œ 29
Data Centers Ditching the Power Grid, Mark Carney's Viral Speech, and Some Joy
Here are some trends I'm following

A year ago, data center developers were focused on connecting to the grid. Today roughly 1/3 of all planned capacity is onsite power, and 72% of that onsite capacity is fossil gas. Homer City PA's data center project could soon be one of the largest single sources of carbon emissions in the US.

31.01.2026 16:13 πŸ‘ 69 πŸ” 42 πŸ’¬ 5 πŸ“Œ 9
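A quick back-of-envelope on the figures in the post: if roughly 1/3 of planned capacity is onsite power, and 72% of that onsite share is fossil gas, then fossil-gas onsite generation is about a quarter of all planned capacity.

```python
# Combine the two shares quoted in the post into one overall fraction.
onsite_share = 1 / 3            # onsite power, of all planned capacity
fossil_share_of_onsite = 0.72   # fossil gas, of the onsite share
fossil_onsite_of_total = onsite_share * fossil_share_of_onsite
print(f"{fossil_onsite_of_total:.0%} of all planned capacity")  # 24%
```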
CommonLID: Re-evaluating State-of-the-Art Language Identification Performance on Web Data
Language identification (LID) is a fundamental step in curating multilingual corpora. However, LID models still perform poorly for many languages, especially on the noisy and heterogeneous web data of...

Announcing our latest paper: CommonLID

In collaboration with @commoncrawl.bsky.social @mlcommons.org @jhu.edu we built a LID benchmark on actual Common Crawl text covering 109 languages. Existing evaluations overestimate how well LangID works on web data.

arxiv.org/abs/2601.18026

13.02.2026 19:27 πŸ‘ 22 πŸ” 12 πŸ’¬ 1 πŸ“Œ 0

warning: earnestpost
thanks Caleb

11.02.2026 17:57 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

extremely poor safekeeping of a student’s private data

as a tech worker I think it’s very disturbing to see Google endangering its own users

11.02.2026 16:33 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
Post image

working on a seven thousand layer model of extended claugenition

08.02.2026 22:46 πŸ‘ 76 πŸ” 7 πŸ’¬ 5 πŸ“Œ 1
Are LLMs Smarter Than Chimpanzees? An Evaluation on Perspective Taking and Knowledge State Estimation
Cognitive anthropology suggests that the distinction of human intelligence lies in the ability to infer other individuals' knowledge states and understand their intentions. In comparison, our closest ...

New work by my former PhD student, Boyang Li

His team produced 500 stories of less than 100 words. LLMs were basically chance-level at answering binary questions about the stories

arxiv.org/abs/2601.12410

04.02.2026 00:36 πŸ‘ 119 πŸ” 15 πŸ’¬ 6 πŸ“Œ 14

This is a real banger of a paper. The example of a model being weirdly focused on jasmine (lol) makes me increasingly think that single-point-of-access models don't really consider who their audience is. Jasmine is a super legible cultural marker for people outside, but is so, _so_ generic.

03.02.2026 16:41 πŸ‘ 12 πŸ” 4 πŸ’¬ 2 πŸ“Œ 0
From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence
Can we learn more from data than existed in the generating process itself? Can new and useful information be constructed from merely applying deterministic transformations to existing data? Can the le...

This was a colossal multi-year effort driven by an incredible team that gave this everything: Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, Zico Kolter. Much more in the paper! arxiv.org/abs/2601.03220 7/7

07.01.2026 17:27 πŸ‘ 22 πŸ” 1 πŸ’¬ 1 πŸ“Œ 1
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
Large-scale autoregressive models pretrained on next-token prediction and finetuned with reinforcement learning (RL) have achieved unprecedented success on many problem domains. During RL, these model...

Well this is exciting: arxiv.org/abs/2512.20605

06.01.2026 19:53 πŸ‘ 54 πŸ” 7 πŸ’¬ 1 πŸ“Œ 0

the reason I'd follow Cat Hicks into hell is this unswerving humanist conviction that actually

people are going to do the best they can

we can help them do even better

and neither avenue is served by thinking less of people

03.01.2026 23:13 πŸ‘ 79 πŸ” 9 πŸ’¬ 3 πŸ“Œ 0
39C3 - From Silicon to Darude Sand-storm: breaking famous synthesizer DSPs
YouTube video by media.ccc.de

i think we are about to experience an explosion of the possibilities in reverse engineering

02.01.2026 19:38 πŸ‘ 48 πŸ” 3 πŸ’¬ 2 πŸ“Œ 0

we’re at a fascinating moment where I’m still ~better at programming than Claude on a medium-horizon difficulty task, but Claude has me absolutely beat on cognitive fatigue, so we’re able to ship so much more stuff I never would’ve gotten around to before

02.01.2026 20:56 πŸ‘ 99 πŸ” 4 πŸ’¬ 2 πŸ“Œ 0

Great list of models in 2025 πŸ‘πŸ½

02.01.2026 17:14 πŸ‘ 3 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
arXiv AI/ML Catch-Up
Was your New Year's resolution to keep up with arXiv AI/ML preprints? Browse the past week's new uploads in 30 mins.

I uh, made this. It was supposed to be a joke / concept-art thing that scrolls through the torrent of new AI/ML arXiv uploads too fast to read. But I think I iterated too much and made it almost usable.

01.01.2026 23:45 πŸ‘ 78 πŸ” 13 πŸ’¬ 7 πŸ“Œ 3

Everyone’s favorite feed is running on one person’s gaming system. I love how hackable this site is, it makes it much more fun.

26.12.2025 18:47 πŸ‘ 22 πŸ” 1 πŸ’¬ 3 πŸ“Œ 0

If you’re working on a non-fiction research/writing project that isn’t journalism and you don’t have an academic affiliation, how do you find other people who are doing the same thing? Ideally locally (I’m in NY).

22.12.2025 01:53 πŸ‘ 4 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
Owning group data
Thinking about how communities can manage shared data on and off ATProto

local first vs atproto!! what should the source of truth for group data be?

20.12.2025 12:45 πŸ‘ 37 πŸ” 11 πŸ’¬ 1 πŸ“Œ 1
Post image

I am late to the game but I finally read the NeurIPS 2025 best paper on gating in LLMs, it is great.

Qiu et al.
Alibaba, U Edinburgh, Stanford, MIT, Tsinghua U
arxiv.org/abs/2505.06708

1/3

15.12.2025 16:42 πŸ‘ 12 πŸ” 3 πŸ’¬ 1 πŸ“Œ 0