Common Corpus just broke 1M downloads: it took some time, but open data in AI is actually popular.
A little offended Grammarly didn't make a sloppelganger of me
"autoresearch" micro teaching repo from Karpathy
readme edits seem like such a nice dx for open ended hparam tuning, and maybe other kinds of hill climbing too, so much less painful than the old days
github.com/karpathy/aut...
It was all about spying on Americans: www.theatlantic.com/technology/2...
FlashSampling: Fast and Memory-Efficient Exact Sampling
Paper: flashsampling.github.io/FlashSamplin...
We analyzed 250K+ queries & 430K+ clickstream interactions from Asta, our AI-powered research assistant, and today we're releasing the full dataset. How do researchers actually use AI science tools? Here's what we found. 🧵
new blog post on permissioned data in atproto! this one introduces "buckets", the protocol-level primitive for shared access control. I walk through two approaches that don't quite work and land on something that I think does
let me know your thoughts!
tldr iiuc we are once again enclosing the commons and industrializing craft, dispossessing laborers while apotheosizing capital, and to slow down this doomloop we need to innovate new collectives and public goods
This has a very cool result on in-context learned classification tasks, where they disentangle representational quality (how well-separated concept labels are) and readout alignment (how good it is at reading out its own inner labels). Adding demo examples helps through readout, not representations!
Designing around the tight bottleneck on latency and throughput that separates local and cloud compute is such an interesting problem. Significant challenges though
Anti-homeless benches in Pokemon Legends ZA
why is there anti-homeless architecture in pokemon
A year ago, data center developers were focused on connecting to the grid. Today roughly 1/3 of all planned capacity is onsite power - and 72% of that planned capacity is fossil gas. Homer City PA's data center project could soon be one of the largest single sources of carbon emissions in the US.
Announcing our latest paper: CommonLID
In collaboration with @commoncrawl.bsky.social @mlcommons.org @jhu.edu we built a LID benchmark on actual Common Crawl text covering 109 languages. Existing evaluations overestimate how well LangID works on web data.
arxiv.org/abs/2601.18026
warning: earnestpost
thanks Caleb
extremely poor safekeeping of a student's private data
as a tech worker I think it's very disturbing to see Google endangering its own users
working on a seven thousand layer model of extended claugenition
New work by my former PhD student, Boyang Li
His team produced 500 stories of less than 100 words. LLMs were basically chance-level at answering binary questions about the stories
arxiv.org/abs/2601.12410
This is a real banger of a paper. The example of a model being weirdly focused on jasmine (lol) makes me increasingly think that single-point-of-access models don't really consider who their audience is. Jasmine is a super legible cultural marker for people outside, but is so, _so_ generic.
This was a colossal multi-year effort driven by an incredible team that gave this everything: Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, Zico Kolter. Much more in the paper! arxiv.org/abs/2601.03220 7/7
the reason I'd follow Cat Hicks into hell is this unswerving humanist conviction that actually
people are going to do the best they can
we can help them do even better
and neither avenue is served by thinking less of people
i think we are about to experience an explosion of the possibilities in reverse engineering
we're at a fascinating moment where I am still ~better at programming than Claude at a medium-horizon difficulty task, but Claude has me absolutely beat in terms of cognitive fatigue, so we're able to ship so much more stuff I never would've gotten around to before
Great list of models in 2025
I uh, made this. It was supposed to be a joke / concept-art thing that scrolls through the torrent of new AI/ML arXiv uploads too fast to read. But I think I iterated too much and made it almost usable.
Everyone's favorite feed is running on one person's gaming system. I love how hackable this site is; it makes it much more fun.
If you're working on a non-fiction research/writing project that isn't journalism and you don't have an academic affiliation, how do you find other people who are doing the same thing? Ideally locally (I'm in NY).
local first vs atproto!! what should the source of truth for group data be?
I am late to the game but I finally read the NeurIPS 2025 best paper on gating in LLMs, it is great.
Qiu et al.
Alibaba, U Edinburgh, Stanford, MIT, Tsinghua U
arxiv.org/abs/2505.06708
1/3