We validate everything empirically on 11 models (GPT-2, Gemma 3, Qwen 3, Llama 2, Mistral 7B) across 8 safety-related concepts. All our theorems are confirmed experimentally.
6/7
09.03.2026 08:55
👍 0
🔁 0
💬 1
📌 0
Result 3: Steering always hurts global performance.
Cross-entropy increases quadratically around α = 0, i.e., there's no free lunch. For large α, performance plateaus because the output becomes input-independent.
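A quick way to see both regimes, reusing `model`, `tok`, `LAYER`, and `steer` from the sketch under post 2/7 below (the text and the grid of strengths are illustrative placeholders, not the paper's setup):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def avg_cross_entropy(alpha, text):
    """Average next-token cross-entropy on `text` under steering strength alpha."""
    handle = model.transformer.h[LAYER].register_forward_hook(steer(alpha))
    enc = tok(text, return_tensors="pt")
    logits = model(**enc).logits[0, :-1]  # predictions for positions 1..T
    handle.remove()
    return F.cross_entropy(logits, enc.input_ids[0, 1:]).item()

# Quadratic rise near alpha = 0, then a plateau once outputs ignore the input.
for a in [0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0]:
    print(a, avg_cross_entropy(a, "The quick brown fox jumps over the lazy dog."))
```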
5/7
09.03.2026 08:55
👍 1
🔁 0
💬 1
📌 0
Result 2: Concept probability follows a sigmoidal curve.
The probability that the target concept appears in the output increases smoothly with α, following a tanh shape. Off-target concepts decrease or vanish.
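As an illustration of the claimed shape (not the paper's data: the frequencies below are made-up placeholders, and the parametrization is just one plausible choice), one could fit such a curve like this:

```python
import numpy as np
from scipy.optimize import curve_fit

def tanh_curve(alpha, a, b, c, d):
    # One plausible parametrization of a tanh-shaped concept-probability curve.
    return a + b * np.tanh(c * (alpha - d))

alphas = np.linspace(-4, 12, 9)             # steering strengths swept
freqs = np.array([0.02, 0.03, 0.05, 0.15, 0.45,
                  0.80, 0.93, 0.97, 0.98])  # placeholder concept frequencies
params, _ = curve_fit(tanh_curve, alphas, freqs, p0=[0.5, 0.5, 0.5, 4.0])
print(dict(zip("abcd", params)))
```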
4/7
09.03.2026 08:55
👍 0
🔁 0
💬 1
📌 0
Result 1: Next-token probabilities have a bump shape.
As α increases, most token probabilities rise, peak at some α > 0, then fall. In particular, off-target tokens peak _before_ target tokens.
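For illustration, one way to trace this curve, reusing `model`, `tok`, `LAYER`, and `steer` from the sketch under post 2/7 below (prompt and token are made-up placeholders):

```python
import torch

@torch.no_grad()
def next_token_prob(alpha, prompt, token):
    """Probability of `token` as the next token under steering strength alpha."""
    handle = model.transformer.h[LAYER].register_forward_hook(steer(alpha))
    enc = tok(prompt, return_tensors="pt")
    probs = model(**enc).logits[0, -1].softmax(dim=-1)
    handle.remove()
    return probs[tok(token).input_ids[0]].item()

# Sweep alpha: the probability rises, peaks at some alpha > 0, then falls.
for a in [0.0, 1.0, 2.0, 4.0, 8.0, 16.0]:
    print(a, next_token_prob(a, "Tonight I will", " sleep"))
```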
3/7
09.03.2026 08:55
👍 0
🔁 0
💬 1
📌 0
Take contrastive prompts (e.g., safe vs. malicious), run them through the model, and compute the mean difference of the representations at layer ℓ. That's your steering vector v. Then shift the activations by αv at layer ℓ.
Question: what does α actually do?
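To make the recipe concrete, here's a minimal sketch in PyTorch + transformers. It is not the paper's exact setup: the model, layer index, prompts, and strength are illustrative placeholders, and the hook unpacks GPT-2's block output tuple.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
LAYER = 6  # the layer ℓ where we read, and later shift, activations

@torch.no_grad()
def mean_activation(prompts):
    """Mean hidden state after block LAYER, averaged over non-pad tokens and prompts."""
    enc = tok(prompts, return_tensors="pt", padding=True)
    hs = model(**enc, output_hidden_states=True).hidden_states[LAYER + 1]
    mask = enc.attention_mask.unsqueeze(-1)
    return (hs * mask).sum(dim=(0, 1)) / mask.sum()

# Contrastive prompts (toy placeholders): v is the mean difference.
v = mean_activation(["Tell me a bedtime story."]) \
  - mean_activation(["Tell me how to pick a lock."])

def steer(alpha):
    """Forward hook that adds alpha * v to the residual stream leaving block LAYER."""
    def hook(module, inputs, output):
        return (output[0] + alpha * v,) + output[1:]
    return hook

handle = model.transformer.h[LAYER].register_forward_hook(steer(4.0))
out = model.generate(**tok("Tonight I will", return_tensors="pt"), max_new_tokens=20)
print(tok.decode(out[0]))
handle.remove()
```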
2/7
09.03.2026 08:55
👍 0
🔁 0
💬 1
📌 0
🧵 New preprint! Towards Understanding Steering Strength w/ M. Taimeskhanov and D. Garreau
Activation steering is a popular way to control LLM behavior at inference. But how much should you steer? We provide the first theoretical analysis of the steering strength α.
📄 arxiv.org/abs/2602.02712
1/7
09.03.2026 08:55
👍 9
🔁 3
💬 1
📌 0
I thought it was just a figure of speech on the 30th, so I went to check. watzeactualfuk?
18.02.2026 10:36
👍 1
🔁 1
💬 0
📌 0
Bullshit is like entropy nowadays: it can only increase.
29.01.2026 18:25
👍 2
🔁 1
💬 1
📌 0
My god I never thought about it in this way. This is just the second law of thermodynamics 😍!
29.01.2026 18:49
👍 2
🔁 0
💬 1
📌 0
The issue is that fighting bullshit (LLMs or not) requires more effort than generating it (LLMs or not). We are doomed to lose.
29.01.2026 17:29
👍 0
🔁 0
💬 1
📌 0
It’s incredible that the growth rate of administrative bullshit outpaces LLM progress, but here we are.
29.01.2026 13:35
👍 4
🔁 0
💬 1
📌 0
but what about all the memes then?
23.01.2026 12:36
👍 1
🔁 0
💬 0
📌 0
Since heating is apparently just a detail, we find alternatives. Startup-nation 🤘
22.01.2026 18:42
👍 5
🔁 0
💬 0
📌 0
This close to calling me young 🥹
20.01.2026 18:00
👍 2
🔁 0
💬 2
📌 0
S-I-X positions?!
20.01.2026 09:20
👍 3
🔁 0
💬 2
📌 0
You sure know how to put yourself in a good mood on a Saturday morning 😐
17.01.2026 07:53
👍 1
🔁 0
💬 1
📌 0
After the 2168 series, I read @fabinou.bsky.social's series on building a Quake PC fabiensanglard.net/quake_pc/ (late-90s style), it brings back so many memories 😍!
17.01.2026 07:51
👍 3
🔁 0
💬 0
📌 0
So I would say that much of it is probably the reference voice, and some "higher order" features (to speak in an old-fashioned way) are manipulated around this reference.
17.01.2026 07:40
👍 2
🔁 0
💬 0
📌 0
So I tested with my voice (speaking only French except maybe two words).
- When asking pocket-tts to generate English, it's really my voice with my horrible accent.
- When asking it to generate French, it starts to get weird 😅
17.01.2026 07:40
👍 2
🔁 0
💬 1
📌 0
GitHub - kyutai-labs/pocket-tts: A TTS that fits in your CPU (and pocket)
I tested Pocket TTS from Kyutai. It's fun to use for generating French sentences: the accent is strong, but it's still perfectly understandable, even though French doesn't seem to be in the training dataset!
Beyond that, the model is quite impressive, especially knowing it has only 100M parameters.
16.01.2026 15:49
👍 12
🔁 0
💬 1
📌 0
Extension for submitting a contribution: deadline January 21 at 23:59!
16.01.2026 11:53
👍 1
🔁 1
💬 0
📌 0
Journées SMAI-MODE 2026 - Sciencesconf.org
The journées SMAI-MODE, the biennial conference of the MODE group of the Société de Mathématiques Appliquées et Industrielles (SMAI), will take place from March 18 to 20, 2026 in Nice.
To register and for more information: mode2026.sciencesconf.org
The plenary speakers are Pierre Ablin, Yann Brenier, Julie Delon, Stéphane Gaubert, Francisco J. Silva Álvarez (J.J. Moreau prize) and Irène Waldspurger.
Note that the number of places is limited!
15.12.2025 08:11
👍 0
🔁 0
💬 0
📌 0
Registration for the MODE 2026 days in Nice is now open. They will take place from March 18 to 20 at the Hôtel Saint-Paul.
Registration is open until March 1 (late fee after Feb 9). The deadline for submitting a contribution is **January 15**.
15.12.2025 08:11
👍 5
🔁 7
💬 1
📌 1
Annual CNRS email asking whether you've won the Nobel Prize ✅
11.12.2025 09:07
👍 4
🔁 0
💬 1
📌 0
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers. R. Kawata · Y. Song · A. Bietti · N. Nishikawa · T. Suzuki · SV · D. Wu. Wed, Dec 3, 2025 • 4:30 PM – 7:30 PM PST, #4001 (spotlight)
Differentiable Generalized Sliced Wasserstein Plans. L. Chapel · R. Tavenard · SV. Fri, Dec 5, 2025 • 11:00 AM – 2:00 PM PST, #1000
Learning Theory for Kernel Bilevel Optimization. F. El Khoury · E. Pauwels · SV · M. Arbel. Fri, Dec 5, 2025 • 4:30 PM – 7:30 PM PST, #3005
I am at #NeurIPS2025, reach out if you want to chat!
03.12.2025 16:19
👍 1
🔁 1
💬 0
📌 0
Happy to be at #NeurIPS in San Diego to present our poster ‘Learning Theory for Kernel Bilevel Optimization’ #3005, Fri at 4:30 p.m. Stop by/ping me to chat, especially about statistics, causality, generative models! Let's connect!
Joint w/ E. Pauwels, @samuelvaiter.com, @michael-arbel.bsky.social
01.12.2025 20:19
👍 3
🔁 1
💬 0
📌 0