Bullshit Bench V2
new: 100 questions across several domains
- Anthropic & Qwen still on top
- Reasoning seems to hurt
- New models are *not* better than old (except Claude)
- Seems to be independent of domain
github.com/petergpt/bul...
Sakana has developed a way to (if I understand correctly) instantly generate LoRAs on demand from long texts or documents
arxiv.org/abs/2506.06105
arxiv.org/abs/2602.15902
Trump has been in office for one year. We at @nature.com did a deep dive looking at the administration's disruption of science in numbers.
Take a look: the numbers are staggering. By me, @dangaristo.bsky.social, Jeff Tollefson, @kimay.bsky.social, & help from @noamross.net @scott-delaney.bsky.social
This line graph illustrates the percentage change in agency staff levels from the previous year for nine major U.S. federal scientific and health organizations between the fiscal years 2016 and 2025. The agencies tracked include the CDC, Department of Energy, EPA, FDA, NASA, NIH, NIST, NOAA, and NSF. For the majority of the timeline between 2016 and 2023, the agencies show relatively stable fluctuations, generally staying within a range of +5% to -5% change per year. However, there is a dramatic and uniform plummet starting in the 2024-25 period. Every agency depicted shows a sharp downward trajectory, with staffing losses ranging from approximately -15% to over -25%. The Environmental Protection Agency (EPA) shows the most significant decline, dropping to roughly -26%, while the National Institute of Standards and Technology (NIST) shows the least severe but still substantial drop at approximately -15%.
This is the most astonishing graph of what the Trump regime has done to US science. They have destroyed the federal science workforce across the board. The negative impacts on Americans will be felt for generations, and the US might never be the same again.
www.nature.com/immersive/d4...
One of my favorite findings: Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.
We found that if you simply delete them after pretraining and recalibrate for <1% of the original budget, you unlock massive context windows. Smarter, not harder.
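The post doesn't include code, so here is a minimal numpy sketch of the core idea: when positional information is an additive learned embedding, it can simply be dropped from the forward pass, leaving a position-free (NoPE-style) model whose attention is still well-defined. Everything here is a toy stand-in (identity projections, random embeddings, hypothetical names); the short recalibration run the post describes is only noted in a comment, not implemented.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    # Single-head self-attention with identity Q/K/V projections for brevity.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

rng = np.random.default_rng(0)
seq_len, d = 6, 8
tok = rng.normal(size=(seq_len, d))        # token embeddings
pos = rng.normal(size=(seq_len, d)) * 0.1  # learned positional embeddings

with_pos = attention(tok + pos)  # standard: positions added to the inputs
no_pos = attention(tok)          # "deleted": the forward pass still works

# As I read the post, the recipe is: delete the positional table after
# pretraining, then briefly fine-tune ("recalibrate") the remaining weights
# for <1% of the original budget so they adapt to the position-free regime.
print(with_pos.shape, no_pos.shape)
```

The point of the toy: nothing in attention itself requires positions, so removal is a clean ablation rather than an architecture change.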
Oh wow, deepseek is starting to make serious progress on LLMs that offload memory to external storage: github.com/deepseek-ai/...
Schematic depicting cortical-subcortical interactions during multi-task learning
Excited to see our paper with @mwcole.bsky.social finally out in peer-reviewed form @natcomms.nature.com! We examine how the human brain learns new tasks and optimizes representations over practice... 1/n
Did you know that AI can figure out its own way to learn, and that its way is better than one designed by humans? Read more in a @nature.com N&V (and the original paper is in the comment) www.nature.com/articles/d41...
Our work with @pawa-pawa.bsky.social is out in Nature Machine Intelligence! The choice of activation function affects the representations, dynamics, and circuit solutions that emerge in RNNs trained on cognitive tasks. Activation matters!
www.nature.com/articles/s42...
(repost welcome) The Generative Model Alignment team at IBM Research is looking for next summer's interns! Two candidates, two topics:
- Reinforcement Learning environments for LLMs
- Speculative and non-autoregressive generation for LLMs
Interested/curious? DM or email ramon.astudillo@ibm.com
Michael X Cohen on why he left academia/neuroscience.
mikexcohen.substack.com/p/why-i-left...
Nature research paper: Arousal as a universal embedding for spatiotemporal brain dynamics
go.nature.com/4nMUgYz
Lab's latest is out in Imaging Neuroscience, led by Kirsten Peterson: "Regularized partial correlation provides reliable functional connectivity estimates while correcting for widespread confounding", where we demonstrate a major improvement to standard fMRI functional connectivity (correlation) 1/n
What complexity of algorithms can AI compute? In a new paper with colleagues at IBM Research, we explore how circuit complexity theory can help quantify the degree of algorithmic generalization in AI systems. www.nature.com/articles/s42...
@natmachintell.nature.com
#ML #AI #MLSky
1/n
Using circuits to formalize algorithmic problems for AI models (e.g., depth as time complexity, size as space complexity), we can quantify the complexity of circuit computations (algorithmic complexity) an AI model can perform.
2/n
While using AI models to generate code is commonplace these days, we still do not fully understand the limits of the complexity of the code these models can formulate.
3/n
Formalizing AI computation in terms of algorithmic complexity offers a formal way to quantify AI systems, and a principled foundation for building more algorithmically capable systems in the future.
Blog: research.ibm.com/blog/ai-algo...
arXiv: arxiv.org/abs/2411.05943
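The thread's framing (circuit depth as time complexity, circuit size as space complexity) can be made concrete with a toy boolean circuit. This is an illustrative sketch, not the paper's formalism: a balanced XOR tree computing the parity of n bits, whose size grows linearly while its depth grows only logarithmically.

```python
from dataclasses import dataclass

@dataclass
class Gate:
    op: str            # 'INPUT' or 'XOR'
    inputs: tuple = ()

def parity_circuit(n):
    """Build a balanced XOR tree computing the parity of n input bits."""
    gates = [Gate('INPUT') for _ in range(n)]
    layer = list(range(n))
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            gates.append(Gate('XOR', (layer[i], layer[i + 1])))
            nxt.append(len(gates) - 1)
        if len(layer) % 2:        # odd element carries over unchanged
            nxt.append(layer[-1])
        layer = nxt
    return gates, layer[0]

def depth(gates, i):
    g = gates[i]
    return 0 if g.op == 'INPUT' else 1 + max(depth(gates, j) for j in g.inputs)

gates, out = parity_circuit(8)
size = sum(g.op != 'INPUT' for g in gates)  # count non-input gates
print(size, depth(gates, out))              # 7 gates, depth 3 (log2 of 8)
```

In this framing, asking whether a model can "perform" parity becomes a question about whether it can realize circuits of that depth and size, which is what makes the complexity-theoretic comparison well-posed.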
Mental health research is at a turning point: breakthroughs can transform lives, but only with bold action, investment, and open collaboration. The time for action is now. Read our full statement here: childmind.org/blog/can-sci...
Out today in Nature Machine Intelligence!
From childhood on, people can create novel, playful, and creative goals. Models have yet to capture this ability. We propose a new way to represent goals and report a model that can generate human-like goals in a playful setting... 1/N
New preprint! Ziyan and I explore how task order impacts continual learning in neural networks and how to optimize it. Our analysis highlights two key principles for better task sequencing.
Check it out: arxiv.org/pdf/2502.03350
The entire website for the NIH Office of Research on Women's Health (ORWH) is very nearly stripped bare. This is so, so devastating. orwh.od.nih.gov/research/fun...
New paper out! With @batuhanerkat.bsky.social, John McClure, @hussainyk1.bsky.social, @polacklab.bsky.social we reveal how discretized representations in V1 predict suboptimal orientation discrimination. This work reconciles neurometric and psychometric curves
www.nature.com/articles/s41...
New paper in @brain1878.bsky.social: Healthy people under S-ketamine, an NMDAR antagonist, and people living with schizophrenia, a disorder associated with NMDAR hypofunction, spend more time in an external mode of perception - where noisy sensory signals override knowledge about the world.
The origin of color categories | PNAS www.pnas.org/doi/10.1073/...
Check our latest in which we leverage shape metrics to compare neural geometry across regions, sessions or subjects and how their differences predict behavior.
w/ Nejatbakhsh, Duong, @sarah-harvey.bsky.social, Brincat, @siegellab.bsky.social, @earlkmiller.bsky.social & @itsneuronal.bsky.social
Paper shows very small LLMs can match or beat larger ones through 'deep thinking' - evaluating different solution paths - and other tricks. Their 7B model beats o1-preview on complex math by exploring 64 different solutions & picking the best one.
Test-time compute paradigm seems really fruitful.
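The selection trick the post describes is best-of-N sampling: draw many candidate solutions and keep the one a scorer ranks highest. Here is a minimal sketch with toy stand-ins; all names are hypothetical, and a real setup would replace `generate` with an LLM sampler and `score` with a verifier or reward model.

```python
import random

def best_of_n(generate, score, n=64, seed=0):
    """Sample n candidate solutions and return the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: 'generate' guesses an integer, 'score' rewards closeness to 42.
guess = lambda rng: rng.randint(0, 100)
closeness = lambda x: -abs(x - 42)

best = best_of_n(guess, closeness, n=64)
print(best)
```

Even with a weak generator, widening N trades extra test-time compute for a better chance that at least one sample scores well, which is the paradigm's core bet.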
New results for a new year! "Linking neural population formatting to function" describes our modern take on an old question: how can we understand the contribution of a brain area to behavior?
www.biorxiv.org/content/10.1...
#neuroskyence
1/
And relatedly, Felix wrote a good piece on the stress and anxiety currently affecting many people who work in AI due to the current climate in the industry:
docs.google.com/document/d/1...
If only more folks in AI were gentle and introspective like this...
What was the most important machine learning paper in 2024?
My Famous Deep Learning Papers list (that I use in teaching) does not include any new ideas from the last year.
papers.baulab.info
Which single new paper would you add?
Some of my thoughts on OpenAI's o3 and the ARC-AGI benchmark
aiguide.substack.com/p/did-openai...