Laurence Aitchison

@laurenceai

Lecturer at the University of Bristol. Probabilistic ML, optimisation, interpretability, LLM evals.

223 Followers
496 Following
5 Posts
Joined 19.08.2023

Latest posts by Laurence Aitchison @laurenceai

Our paper on the best way to add error bars to LLM evals is on arXiv! TL;DR: Avoid the Central Limit Theorem -- there are better, simple Bayesian and frequentist methods you should be using instead.

We also provide a super lightweight library: github.com/sambowyer/baye… 🧵👇

06.03.2025 15:00 👍 25 🔁 8 💬 1 📌 0
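
As a rough illustration of why CLT-based error bars can mislead on small evals, here is a minimal sketch comparing the CLT (Wald) interval with the Wilson score interval. This is not taken from the paper or its library: the function names are hypothetical, and the Wilson interval is simply one standard frequentist alternative, assumed here for illustration.

```python
import math


def wald_interval(correct, n, z=1.96):
    """CLT-based (Wald) interval for an accuracy of correct/n."""
    p = correct / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for an accuracy of correct/n."""
    p = correct / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical eval: a model answers 20 of 20 questions correctly.
    print("Wald:  ", wald_interval(20, 20))
    print("Wilson:", wilson_interval(20, 20))
```

With 20 of 20 correct, the Wald interval collapses to zero width at 100% accuracy, whereas the Wilson interval still spans roughly 0.84 to 1.0.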

Go read it on arXiv! Thanks to my co-authors @sambowyer.bsky.social and @laurenceai.bsky.social 💥

06.03.2025 15:00 👍 4 🔁 1 💬 1 📌 0

📣 Jobs alert

We're hiring a postdoc and a research engineer to work on UQ for LLMs!! Details ⬇️

#ai #llm #uq

12.02.2025 16:26 👍 13 🔁 11 💬 0 📌 0

Do you know what rating you'll give after reading the intro? Are your confidence scores 4 or higher? Do you not respond in rebuttal phases? Are you worried how it will look if your rating is the only 8 among 3's? This thread is for you.

27.11.2024 17:25 👍 77 🔁 20 💬 4 📌 3

Would love to be added!

27.11.2024 21:47 👍 1 🔁 0 💬 2 📌 0

But you can't prove that the *real* asteroid won't hit earth, because the real world isn't your simplified model. e.g. you don't know the initial conditions, there might be other bodies you aren't aware of etc. etc.

27.11.2024 10:44 👍 0 🔁 0 💬 0 📌 0

The analogy we're working from is "mathematically provable asteroid safety": within a simplified mathematical model, with known initial conditions, you can prove that an asteroid won't hit earth. (2/3)

27.11.2024 10:44 👍 0 🔁 0 💬 1 📌 0

Does anyone want to collaborate on an ICML position paper on "The impossibility of mathematically proving AI safety"? The basic thesis being that it is a category error to try to prove AI safety in the real world. (1/3)

27.11.2024 10:44 👍 2 🔁 0 💬 1 📌 0

Can you add?

26.11.2024 10:58 👍 0 🔁 0 💬 0 📌 0