Antonin Poché (@antoninpoche)

Oh, and I also made a longer video with a voice-over, if it's useful to anyone.

🔊

04.03.2026 10:10 👍 0 🔁 0 💬 0 📌 0

If you are interested in the library, you can check out the corresponding thread below:
bsky.app/profile/anto...

Or the GitHub directly: github.com/FOR-sight-ai...

04.03.2026 10:10 👍 0 🔁 0 💬 1 📌 0

🔥Super excited to share our new demo website for 🪄Interpreto!

🖼️It is basically an explanation gallery showcasing attribution and concept-based explanations for classification and generation.

🎮Play with it: for-sight-ai.github.io/interpreto-d...

We will keep improving it, so stay tuned!

04.03.2026 10:10 👍 8 🔁 2 💬 1 📌 0

I also did a thread to present the library quickly:

bsky.app/profile/anto...

23.01.2026 13:52 👍 0 🔁 0 💬 0 📌 0

Pleasantly surprised to see our blog post trending on Hugging Face 🤗

Well, @fannyjrd.bsky.social did a great job! 🚀

If you missed it, check it out: huggingface.co/blog/Fannyjr...

It's a didactic presentation of our new library: 🪄 Interpreto:
github.com/FOR-sight-ai...

23.01.2026 13:51 👍 3 🔁 1 💬 1 📌 0

It was an honor to be part of this awesome project! Interpreto is a great up-and-coming tool for concept-based interpretability analyses of NLP models, check it out!

21.01.2026 04:20 👍 7 🔁 1 💬 0 📌 0

GitHub - FOR-sight-ai/interpreto: 🪄 Interpreto is an interpretability toolbox for LLMs

🎉 I’m thrilled to announce the release of Interpreto: a user-friendly, open-source toolbox to make NLP model interpretability accessible, practical, and rigorous.
github.com/FOR-sight-ai...
🧵1/5

20.01.2026 17:32 👍 7 🔁 1 💬 1 📌 0

📦You can find the library on GitHub: github.com/FOR-sight-ai...

📚Access the documentation: for-sight-ai.github.io/interpreto/

⏬Install with uv: `uv pip install interpreto`

📰Look at our paper: arxiv.org/abs/2512.097...

🤗 Check out our Hugging Face blog post: huggingface.co/blog/Fannyjr...

8/8

20.01.2026 16:10 👍 4 🔁 1 💬 0 📌 0

🔥The amazing team: @fannyjrd.bsky.social, Thomas Mullor, @gsarti.com, Frédéric Boisnard, Corentin Friedrich, Charlotte Claye, François Hooft, and Raphaël Bernas!!

🙏And to the supporters: IRT Saint Exupery, ANITI, @centralesupelec.bsky.social, DEEL.ai and FOR projects.

7/8

20.01.2026 16:10 👍 2 🔁 0 💬 1 📌 0

Overview - Interpreto Interpretability Toolkit for LLMs

You can do all these steps in interpreto using a wide range of methods.

Check out the documentation for more details: for-sight-ai.github.io/interpreto/a...

Or the tutorials:

- for-sight-ai.github.io/interpreto/n...
- for-sight-ai.github.io/interpreto/n...

6/8

20.01.2026 16:10 👍 1 🔁 0 💬 1 📌 0

For concepts, there are 5 steps:

1. Split the model and get activations. (wraps `nnsight` @ndif-team.bsky.social)

2. Find patterns in activations (SAEs...) (wraps `overcomplete` @thomasfel.bsky.social)

3. Interpret the concepts

4. Estimate concepts' contributions to the output

5. Evaluate
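
To make the steps concrete, here is a toy sketch in plain NumPy. It does not use the interpreto, `nnsight`, or `overcomplete` APIs: the two-part model, the random data, and the truncated SVD (standing in for an SAE) are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1. Split the model into h (inputs -> activations) and g (activations -> logits).
#    Both halves are hypothetical random layers, not a real language model.
X = rng.normal(size=(50, 8))        # 50 toy inputs with 8 features each
W_h = rng.normal(size=(8, 6))       # weights of the first half
W_g = rng.normal(size=(6, 3))       # weights of the second half (3 classes)


def h(x):
    """First half of the toy model: linear layer followed by ReLU."""
    return np.maximum(x @ W_h, 0.0)


def g(a):
    """Second half of the toy model: linear read-out to logits."""
    return a @ W_g


A = h(X)                            # activations, shape (50, 6)

# 2. Find patterns in activations. A truncated SVD stands in for an SAE here.
n_concepts = 3
U, S, Vt = np.linalg.svd(A, full_matrices=False)
codes = U[:, :n_concepts] * S[:n_concepts]   # concept activations, (50, 3)
dictionary = Vt[:n_concepts]                 # concept directions, (3, 6)

# 3. Interpret the concepts: indices of the 5 inputs that activate each one most.
top_inputs = np.argsort(-np.abs(codes), axis=0)[:5]   # shape (5, 3)

# 4. Estimate contributions to the output. g is linear in this toy setup,
#    so each concept direction maps directly to a change in the logits.
concept_to_logits = dictionary @ W_g         # shape (3, 3)

# 5. Evaluate: how much of the activations do the concepts reconstruct?
A_hat = codes @ dictionary
rel_error = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
print(f"relative reconstruction error: {rel_error:.3f}")
```

With a real model, step 4 usually needs gradients or perturbations, because the second half of the network is not linear.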

5/8

20.01.2026 16:10 👍 2 🔁 0 💬 1 📌 0

💡Interpreto provides concept-based explanations (post-hoc unsupervised), part of the Mechanistic Interpretability field. Concepts answer:

❔What higher-level features exist inside the model’s hidden space, and how do they affect outputs?

4/8

⬇️Example on the AG News dataset.

20.01.2026 16:10 👍 1 🔁 0 💬 1 📌 0

🔥 We implement the classic attribution methods, for both `ForSequenceClassification` and `ForCausalLM` models.

There are both perturbation-based ➡️ and gradient-based 🔁 methods, about 10 in total.

📊There are two metrics.

🔹🔷🟦You can set the granularity of explanations.
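
For intuition, here is a minimal sketch of one perturbation-based method, occlusion at word granularity. It is not interpreto's API: `occlusion_attributions`, `toy_score`, and the word weights are hypothetical names and values, with the toy scorer standing in for a classifier's class score.

```python
from typing import Callable, List


def occlusion_attributions(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    mask_token: str = "[MASK]",
) -> List[float]:
    """Attribute to each token the score drop observed when it is masked."""
    baseline = score_fn(tokens)
    attributions = []
    for i in range(len(tokens)):
        # Replace only the i-th token and re-score the perturbed input.
        perturbed = tokens[:i] + [mask_token] + tokens[i + 1:]
        attributions.append(baseline - score_fn(perturbed))
    return attributions


# Hypothetical stand-in for a model's "positive" class score.
POSITIVE_WORDS = {"great": 2.0, "good": 1.0}


def toy_score(tokens: List[str]) -> float:
    """Sum the weights of known positive words (toy scorer, not a real model)."""
    return sum(POSITIVE_WORDS.get(t.lower(), 0.0) for t in tokens)


tokens = "The movie was great".split()
attributions = occlusion_attributions(tokens, toy_score)
for token, value in zip(tokens, attributions):
    print(f"{token:>6}: {value:+.1f}")
```

Changing the granularity amounts to masking a larger unit (a word, a sentence) instead of a single token in the loop.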

3/8

20.01.2026 16:10 👍 1 🔁 0 💬 1 📌 0

🎓➡️👥The goal of the library is to bridge the gap between practitioners applying interpretability methods and the SOTA.

🚀The library is still in active development. Hence, we welcome your feedback and contributions. 🤗

👋📨 Raise an issue, open a PR, or contact us.

2/8

20.01.2026 16:10 👍 2 🔁 0 💬 1 📌 0

🔥I am super excited for the official release of an open-source library we've been working on for about a year!

🪄interpreto is an interpretability toolbox for HF language models🤗, for both generation and classification!

Why do you need it, and for what?

1/8 (links at the end)

20.01.2026 16:03 👍 20 🔁 9 💬 1 📌 3

If you use GMail, AI (Gemini) was turned on yesterday by default and now scans all of your content for machine learning. To turn off, go to Settings>General and scroll down. Uncheck the box for "Smart features."

There are other "Smart" add-ons as well, but that's the one that reads your content.

20.11.2025 17:32 👍 10768 🔁 8014 💬 326 📌 787

🕳️🐇 𝙄𝙣𝙩𝙤 𝙩𝙝𝙚 𝙍𝙖𝙗𝙗𝙞𝙩 𝙃𝙪𝙡𝙡 – 𝙋𝙖𝙧𝙩 𝙄 (𝑃𝑎𝑟𝑡 𝐼𝐼 𝑡𝑜𝑚𝑜𝑟𝑟𝑜𝑤)

𝗔𝗻 𝗶𝗻𝘁𝗲𝗿𝗽𝗿𝗲𝘁𝗮𝗯𝗶𝗹𝗶𝘁𝘆 𝗱𝗲𝗲𝗽 𝗱𝗶𝘃𝗲 𝗶𝗻𝘁𝗼 𝗗𝗜𝗡𝗢𝘃𝟮, one of vision’s most important foundation models.

And today is Part I, buckle up, we're exploring some of its most charming features. :)

14.10.2025 21:00 👍 36 🔁 12 💬 2 📌 0

expressing appreciation for this scientific diagram

05.10.2025 20:55 👍 50 🔁 7 💬 3 📌 0

Could it be biased by people answering randomly?

If about 1 person in 5 answers randomly and the others guess correctly, wouldn't you obtain your blue curve?

22.09.2025 18:21 👍 1 🔁 0 💬 0 📌 0

Want the full story behind the poster? 🎉
I broke down the methodology and results here 👇

25.07.2025 15:38 👍 0 🔁 0 💬 0 📌 0

🔥 I am super excited to be presenting a poster at #ACL2025 in Vienna next week! 🌏

This is my first big conference!

📅 Tuesday morning, 10:30–12:00, during Poster Session 2.

💬 If you're around, feel free to message me. I would be happy to connect, chat, or have a drink!

25.07.2025 15:37 👍 5 🔁 1 💬 1 📌 0

🚨 New preprint! 🚨

Everyone loves causal interp. It’s coherently defined! It makes testable predictions about mechanistic interventions! But what if we had a different objective: predicting model behavior not under mechanistic interventions, but on unseen input data?

10.07.2025 14:30 👍 63 🔁 12 💬 3 📌 2

🔥ConSim has been accepted to the #ACL2025 main conference!

🙏 Thanks again to my amazing co-authors: @alon_jacovi, Agustin Picard, @VictorBoutin, and @Fannyjrd_.

Work done in DEEL and FOR from IRT St Exupéry and @ANITI_Toulouse.

See you in Vienna 📅

For more information, check out my last post:

16.05.2025 08:45 👍 4 🔁 1 💬 1 📌 0

BlackboxNLP is back! 💥

Happy to be part of the organizing team for this year, and super excited for our new shared task using the excellent MIB Benchmark, check it out! blackboxnlp.github.io/2025/task/

15.05.2025 08:24 👍 6 🔁 2 💬 0 📌 0

🎉 Our Actionable Interpretability workshop has been accepted to #ICML2025! 🎉
> Follow @actinterp.bsky.social
> Website actionable-interpretability.github.io

@talhaklay.bsky.social @anja.re @mariusmosbach.bsky.social @sarah-nlp.bsky.social @iftenney.bsky.social

Paper submission deadline: May 9th!

31.03.2025 16:59 👍 42 🔁 16 💬 3 📌 3

The biggest reason government officials aren't giving any specifics about the criteria by which these arrests and deportations are selected is that the criteria are "pro-Israel think-tanks and advocacy organizations created lists of troublesome individuals and gave them to us."

Hundreds of international students have just received an email telling them their visas have been revoked.

The ‘justification’ is campus activism or social media posts.

timesofindia.indiatimes.com/world/us/hun...

29.03.2025 14:11 👍 5657 🔁 3086 💬 208 📌 610

On the Biology of a Large Language Model

Can we understand the mechanisms of a frontier AI model?

📝 Blog post: www.anthropic.com/research/tra...
🧪 "Biology" paper: transformer-circuits.pub/2025/attribu...
⚙️ Methods paper: transformer-circuits.pub/2025/attribu...

Featuring basic multi-step reasoning, planning, introspection and more!

27.03.2025 18:18 👍 125 🔁 28 💬 4 📌 3

Jawdropping.

You would expect this in a dictatorship, not the United States.

This country is unrecognizable.

20.03.2025 02:11 👍 18712 🔁 7691 💬 1405 📌 825

What will be the linchpin for AI dominance?

Read our NSF/OSTP recommendations written with Goodfire's Tom McGrath tommcgrath.github.io, Transluce's Sarah Schwettmann cogconfluence.com, MIT's Dylan Hadfield-Menell @dhadfieldmenell.bsky.social

TL;DR: Dominance comes from **interpretability** 🧵 ↘️

16.03.2025 13:57 👍 21 🔁 8 💬 1 📌 1

An assembly of 18 European companies, labs, and universities have banded together to launch 🇪🇺 EuroBERT!

It's a state-of-the-art multilingual encoder for 15 European languages, designed to be finetuned for retrieval, classification, etc.

Details in 🧵

10.03.2025 09:43 👍 80 🔁 20 💬 5 📌 1
