Niklas Stoehr

@niklasstoehr

Gemini Post-Training ⚫️ Research Scientist at Google DeepMind ⚫️ PhD from ETH Zurich

1,091 Followers
218 Following
7 Posts
Joined 16.11.2024

Latest posts by Niklas Stoehr @niklasstoehr

(1/n) We analyzed 3,984 agent skills from major marketplaces and found 76 malicious payloads, including credential theft, backdoor installation, and data exfiltration.

Also, 13.4% contain at least one critical-level vuln.

Full report below, highlights in thread 👇

github.com/invariantlab...

06.02.2026 16:33 👍 3 🔁 1 💬 1 📌 0

📣📣 New paper with #JoergFriedrichs, @florianschaffner.bsky.social, @niklasstoehr.bsky.social, “Populism and governmentalism as thin-centered ideologies: Emotions and frames on social media” is out at @ejprjournal.bsky.social

Read more: doi.org/10.1017/S147...

19.12.2025 11:57 👍 22 🔁 5 💬 1 📌 0

🔓What brought me into Machine Learning research is its universal applicability—pattern recognition underlies all empirical sciences and even lets you publish in top polscience journals such as EJPR: shorturl.at/8e1rF @giulianoformisano.bsky.social @florianschaffner.bsky.social and Joerg Friedrichs.

19.12.2025 12:21 👍 2 🔁 0 💬 0 📌 0

⚖️ Measuring Scalar Constructs in Social Science with LLMs

with rising (and established) stars in Computational Social Science

@haukelicht.bsky.social
@rupak-s.bsky.social
@patrickwu.bsky.social
@pranavgoel.bsky.social
@elliottash.bsky.social
@alexanderhoyle.bsky.social

arxiv.org/abs/2509.03116

17.11.2025 09:29 👍 15 🔁 4 💬 0 📌 0

Paper: arxiv.org/abs/2509.03116

Code: github.com/haukelicht/s...

With:
@haukelicht.bsky.social *
@rupak-s.bsky.social *
@patrickwu.bsky.social
@pranavgoel.bsky.social
@niklasstoehr.bsky.social
@elliottash.bsky.social

28.10.2025 06:20 👍 5 🔁 1 💬 0 📌 0
A diagram illustrating pointwise scoring with a large language model (LLM). At the top is a text box containing instructions: 'You will see the text of a political advertisement about a candidate. Rate it on a scale ranging from 1 to 9, where 1 indicates a positive view of the candidate and 9 indicates a negative view of the candidate.' Below this is a green text box containing an example ad text: 'Joe Biden is going to eat your grandchildren for dinner.' An arrow points down from this text to an illustration of a computer with 'LLM' displayed on its monitor. Finally, an arrow points from the computer down to the number '9' in large teal text, representing the LLM's scoring output. This diagram demonstrates how an LLM directly assigns a numerical score to text based on given criteria

[corrected link]

LLMs are often used for text annotation in social science. In some cases, this involves placing text items on a scale: e.g., 1 for liberal and 9 for conservative.

There are a few ways to handle this task. Which work best? Our new EMNLP paper has some answers🧵
arxiv.org/abs/2509.03116

28.10.2025 06:23 👍 24 🔁 5 💬 1 📌 0
Screenshot of first page of paper. It is here: https://arxiv.org/pdf/2507.00828

Abstract: Topic model and document-clustering evaluations either use automated metrics that align poorly with human preferences or require expert labels that are intractable to scale. We design a scalable human evaluation protocol and a corresponding automated approximation that reflect practitioners' real-world usage of models. Annotators -- or an LLM-based proxy -- review text items assigned to a topic or cluster, infer a category for the group, then apply that category to other documents. Using this protocol, we collect extensive crowdworker annotations of outputs from a diverse set of topic models on two datasets. We then use these annotations to validate automated proxies, finding that the best LLM proxies are statistically indistinguishable from a human annotator and can therefore serve as a reasonable substitute in automated evaluations

Evaluating topic models (and document clustering methods) is hard. In fact, since our paper critiquing standard evaluation practices four years ago, there hasn't been a good replacement metric

That ends today (we hope)! Our new ACL paper introduces an LLM-based evaluation protocol 🧵

08.07.2025 12:40 👍 52 🔁 10 💬 3 📌 2

Congrats and have a great start at Mila! 🙂

01.07.2025 22:39 👍 2 🔁 0 💬 0 📌 0

🎓 I recently defended my PhD and moved from one dream team at ETH Zurich to another at DeepMind—a huge thank you to the many people who have supported me along the way!

11.06.2025 09:39 👍 32 🔁 0 💬 0 📌 0

@vesteinns.bsky.social

04.04.2025 19:50 👍 0 🔁 0 💬 0 📌 0

Our paper "A Practical Method for Generating String Counterfactuals" has been accepted to the Findings of NAACL 2025! This is joint work with @matan-avitan.bsky.social, @yoavgo.bsky.social and Ryan Cotterell. We propose "Intervention Lens", a technique for explaining interventions in natural language. (1/6)

12.02.2025 15:19 👍 38 🔁 4 💬 1 📌 2

Are LLMs biased when they write about political issues?

We just released IssueBench – the largest, most realistic benchmark of its kind – to answer this question more robustly than ever before.

Long 🧵with spicy results 👇

13.02.2025 14:08 👍 83 🔁 27 💬 4 📌 3

Can we understand and control how language models balance context and prior knowledge? Our latest paper shows it’s all about a 1D knob! 🎛️
arxiv.org/abs/2411.07404

Co-led with
@kevdududu.bsky.social - @niklasstoehr.bsky.social, Giovanni Monea, @wendlerc.bsky.social, Robert West & Ryan Cotterell.

22.11.2024 15:49 👍 13 🔁 3 💬 1 📌 0

mech interp: bsky.app/starter-pack...
women in nlp: bsky.app/starter-pack...
nlp #1: bsky.app/starter-pack...
nlp #2: bsky.app/starter-pack...
ml/data/tech: bsky.app/starter-pack...
robotics & ai: bsky.app/starter-pack...

19.11.2024 19:22 👍 73 🔁 19 💬 7 📌 4

If you’re interested in mechanistic interpretability, I just found this starter pack and wanted to boost it (thanks for creating it @butanium.bsky.social !). Excited to have a mech interp community on bluesky 🎉

go.bsky.app/LisK3CP

19.11.2024 00:28 👍 36 🔁 8 💬 3 📌 2

Just launched a Political Comm/NLP/Text-as-Data Starter Pack. 🦋🤗

Join us and/or drop a message to be added!

go.bsky.app/39MWTjg #starterpack #polsci

18.11.2024 15:01 👍 30 🔁 10 💬 3 📌 0

Trying to bring ML/NLP/et al. people from ETH Zürich together. Ping me to be added. 🙂
bsky.app/starter-pack...

18.11.2024 10:51 👍 26 🔁 6 💬 1 📌 0

☝️🤗

17.11.2024 16:47 👍 1 🔁 0 💬 1 📌 0
Niklas Stoehr - ACL Anthology

@mginn.bsky.social May I please ask to be added to the list as well? ☺️ Many thanks!

aclanthology.org/people/n/nik...

17.11.2024 10:51 👍 1 🔁 0 💬 0 📌 0