1/13 New Paper!! We try to understand why some LMs self-improve their reasoning while others hit a wall. The key? Cognitive behaviors! Read our paper on how the right cognitive behaviors can make all the difference in a model's ability to improve with RL! 🧵
04.03.2025 18:15
Thank you to @sloanfoundation.bsky.social for this generous award to our lab. Hopefully this will bring us closer to building truly general-purpose robots!
18.02.2025 16:50
(Many) more details in our paper! arxiv.org/abs/2410.02749
12.02.2025 20:08
LMs trained to synthesize programs by repeatedly editing their own generations produce more diverse code compared to baselines
This improves the trade-off between test-time FLOPs and pass@k
12.02.2025 20:08
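For readers who want the metric spelled out: pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the tests. The sketch below assumes that standard estimator; the post does not say which variant the paper uses, and the function name `pass_at_k` is only illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, of which c passed the unit tests.

    Assumes the standard unbiased estimator 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # With fewer than k failing samples, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

A model that produces more diverse samples tends to raise c across problems for the same n, which is one way diversity can show up as a better pass@k at a fixed sampling budget.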
Our approach introduces an algorithm, LintSeq, for sampling across interdependent lines in source code by using a code linter
With LintSeq, we can generate plausible edit *trajectories* for any source code file, covering possible ways of synthesizing its contents edit-by-edit with no linter errors
12.02.2025 20:08
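The idea of linter-guided edit trajectories can be illustrated with a toy sketch. This is not the LintSeq implementation: it assumes Python source, uses `ast.parse` as a stand-in for a real linter (so it only catches syntax errors), and simply rejects deletions that break the check. The names `passes_lint` and `sample_edit_trajectory` are hypothetical.

```python
import ast
import random


def passes_lint(lines):
    """Stand-in for a linter: only checks that the program still parses."""
    try:
        ast.parse("\n".join(lines))
        return True
    except SyntaxError:
        return False


def sample_edit_trajectory(source, rng, max_tries=200):
    """Return program states from empty to `source`, each one lint-clean.

    Works backward: delete random chunks of lines, keep a deletion only if
    the remaining program passes the check, then reverse the states so that
    each step is an insertion edit.
    """
    current = source.splitlines()
    states = [current]
    tries = 0
    while current and tries < max_tries:
        tries += 1
        # Pick a random contiguous chunk of lines to delete.
        start = rng.randrange(len(current))
        end = rng.randint(start + 1, len(current))
        candidate = current[:start] + current[end:]
        if passes_lint(candidate):
            current = candidate
            states.append(current)
    if current:
        # The empty program always parses, so it is a valid starting state.
        states.append([])
    states.reverse()
    return states
```

Calling `sample_edit_trajectory(text, random.Random(0))` on a small Python file yields a list of intermediate programs; the difference between consecutive states is one insertion edit, and different seeds give different trajectories for the same file.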
Our paper showing that LMs benefit from human-like abstractions for code synthesis was accepted to ICLR! 🇸🇬
We show that order matters in code gen. -- casting code synthesis as a sequential edit problem by preprocessing examples in SFT data improves LM test-time scaling laws
12.02.2025 20:08
Can we extend the power of world models beyond just online model-based learning? Absolutely!
We believe the true potential of world models lies in enabling agents to reason at test time.
Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.
31.01.2025 19:24
Williams and Zipser (1989) is a classic one! leech.cybernoid.gr/files/text/p...
30.01.2025 17:47
Introducing 🧞 Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.
04.12.2024 16:01
Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.
go.bsky.app/MdVxrtD
20.11.2024 07:08