Eduardo Pignatelli

@epignatelli.com

Assistant professor (UK Lecturer) at @UCL. PhD at @UCL. Past architect. Previously ML Lead at @burohappold. RL, credit assignment, reward-genesis.

Followers: 18
Following: 12
Posts: 1
Joined: 22.11.2024
Latest posts by Eduardo Pignatelli @epignatelli.com

Great to see BALROG on @bsky.app as well!

25.11.2024 15:00 👍 1 🔁 0 💬 0 📌 0
Tired of saturated benchmarks? Want scope for a significant leap in capabilities?

🔥 Introducing BALROG: a Benchmark for Agentic LLM and VLM Reasoning On Games!

BALROG is a challenging benchmark for LLM agentic capabilities, designed to stay relevant for years to come.

1/🧵

21.11.2024 16:24 👍 95 🔁 20 💬 4 📌 7