@ucl-dark.bsky.social entered the stage! Thanks @lauraruis.bsky.social :)
Check out Tim's starter pack for Open-Endedness on Bluesky!
How do LLMs learn to reason from data? Are they retrieving the answers from parametric knowledge? 🦜 In our new preprint, we look at the pretraining data and find evidence against this:
Procedural knowledge in pretraining drives LLM reasoning ⚙️
🧵⬇️
The LLM parrot analogy is dead. Fantastic work by UCL DARK's @lauraruis.bsky.social on rigorously investigating whether LLMs learn reasoning from procedural knowledge during pretraining.
Excited to announce "BALROG: a Benchmark for Agentic LLM and VLM Reasoning On Games," led by UCL DARK's @dpaglieri.bsky.social! Douwe Kiela's plot below is maybe the scariest for AI progress — LLM benchmarks are saturating at an accelerating rate. BALROG to the rescue. This will keep us busy for years.