
Manuel Tonneau

@manueltonneau

PhD candidate @oiioxford.bsky.social · NLP, Computational Social Science · @WorldBank · manueltonneau.com

752
Followers
559
Following
75
Posts
20.09.2023
Joined

Latest posts by Manuel Tonneau @manueltonneau

Thrilled to share that our paper has been accepted to FAccT! See you all in Montréal in June 🇨🇦

03.03.2026 08:52 👍 5 🔁 0 💬 0 📌 0

Due to the high number of applicants, we extended the deadline by one week to **March 8th**.

css2.lakecomoschool.org

02.03.2026 13:02 👍 1 🔁 1 💬 0 📌 1

Can feed algorithms shape what people think about politics? Our paper "The Political Effects of X's Feed Algorithm" is out today in Nature and answers "Yes."

www.nature.com/articles/s41...

18.02.2026 17:01 👍 268 🔁 128 💬 3 📌 24

🚨New WP "@Grok is this true?"
We analyze 1.6M fact-check requests on X (Grok & Perplexity)
📌Usage is polarized; Grok users are more likely to be Republicans
📌BUT Republican posts are rated as false more often, even by Grok
📌Bot agreement with fact-checks is OK but not great; APIs match fact-checkers
osf.io/preprints/ps...

03.02.2026 21:55 👍 118 🔁 48 💬 2 📌 3
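Agreement between bot verdicts and fact-checker ratings, as in the third bullet above, is typically measured with chance-corrected statistics. A minimal sketch of Cohen's kappa on made-up true/false verdicts (the data are illustrative, not from the preprint):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if the two raters labeled independently.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[l] * cb[l] for l in set(a) | set(b)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical verdicts: bot vs. professional fact-checkers.
bot  = ["false", "false", "true", "false", "true", "true", "false", "true"]
fact = ["false", "true",  "true", "false", "true", "false", "false", "true"]
print(cohens_kappa(bot, fact))  # 0.5: moderate agreement after chance correction
```

Raw percent agreement here is 75%, but kappa discounts the agreement two independent raters would reach by chance.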

Thanks also to @johnholbein1.bsky.social whose numerous posts on demographic probing in the social sciences inspired this work and to Matthew Kearney for the useful benchmark dataset.

27.01.2026 13:07 👍 2 🔁 0 💬 0 📌 0
Demographic Probing of Large Language Models Lacks Construct Validity Demographic probing is widely used to study how large language models (LLMs) adapt their behavior to signaled demographic attributes. This approach typically uses a single demographic cue in isolation...

Lots more info in the paper: arxiv.org/abs/2601.18486

I had a blast working on this with my wonderful coauthors @nsehgal.bsky.social Niyati, Victor, Ana Maria, Lakshmi, Sharath and @valentinhofmann.bsky.social

Feedback welcome!

@oii.ox.ac.uk

27.01.2026 13:07 👍 2 🔁 0 💬 1 📌 0

Bottom line: LLM demographic probing lacks construct validity; it does not yield a stable characterization of how models condition on demographics.

We thus recommend using multiple, ecologically valid cues and controlling for confounders to make defensible claims on demographic effects in LLMs.

27.01.2026 13:07 👍 1 🔁 0 💬 1 📌 0

Why does this happen?

We find that cues differ both in how strongly models associate them with demographic traits and in the non-demographic linguistic features they carry, such as readability or length. Both independently affect model behavior.

27.01.2026 13:07 👍 1 🔁 0 💬 1 📌 0

Key result 2: Conclusions on demographic bias depend on how identity is operationalized.

Group disparities, estimated as outcome ratios between groups (e.g., Black vs. White), are unstable and vary in magnitude and even direction across cues.

27.01.2026 13:07 👍 1 🔁 0 💬 1 📌 0
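The instability is easy to see with a toy computation. Below, hypothetical outcome rates (numbers invented for illustration, not results from the paper) give a Black/White ratio that changes magnitude and even direction depending on which cue operationalizes identity:

```python
# Hypothetical outcome rates (share of responses recommending escalation)
# per demographic cue type; numbers invented for illustration.
rates = {
    "name":     {"Black": 0.42, "White": 0.35},
    "dialect":  {"Black": 0.30, "White": 0.38},
    "explicit": {"Black": 0.36, "White": 0.36},
}

for cue, r in rates.items():
    ratio = r["Black"] / r["White"]
    if ratio > 1:
        direction = "Black > White"
    elif ratio < 1:
        direction = "White > Black"
    else:
        direction = "parity"
    print(f"{cue:8s} ratio={ratio:.2f} ({direction})")
```

Same group, three cues, three different conclusions about disparity.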

Key result 1: Different cues signalling the same group do not lead to the same model behavior.

Cues intended to represent the same demographic group often induce only moderately correlated changes in model behavior.

27.01.2026 13:07 👍 2 🔁 0 💬 1 📌 0

We study demographic probing in realistic advice-seeking interactions: healthcare, salary, and legal advice, focusing on race and gender in a U.S. context across multiple LLMs.

Same prompts. Same tasks. Only the demographic cue signalling group membership changes.

27.01.2026 13:07 👍 1 🔁 0 💬 1 📌 0

Demographic cues (e.g., names, dialect) are widely used to study how LLM behavior may change depending on user demographics. Such cues are often assumed interchangeable.

🚨 We show they are not: different cues yield different model behavior for the same group and different conclusions on LLM bias. 🧵👇

27.01.2026 13:07 👍 18 🔁 10 💬 1 📌 0
Prix VIGINUM-Inria « Combating information manipulation » - Sciencesconf.org The VIGINUM-Inria Prize

Coordination detection, detection of #deepfakes, bias and vulnerabilities in recommendation algorithms...

@viginum.bsky.social and #INRIA are launching a scientific prize for combating information manipulation.

👉 pvi-lmi.sciencescall.org

Deadline: 14/02.

#disinfo #FIMI

23.01.2026 06:05 👍 27 🔁 22 💬 0 📌 2

Kudos to my wonderful co-authors Do Lee (doqlee.github.io), Boris Sobol (il.linkedin.com/in/boris-sobol), @nirg.bsky.social and Sam Fraiberger (samuelfraiberger.com).

@oii.ox.ac.uk @nyupress.bsky.social

11/fin

21.01.2026 12:14 👍 2 🔁 0 💬 0 📌 0

Yet platform data-access policies increasingly block this potential. Whether platforms or regulators will enable change in the coming years is a core policy question.

10/N

21.01.2026 12:14 👍 1 🔁 0 💬 1 📌 0

There is clear public value here, potentially extending to other countries, especially where official statistical systems are under-developed.

9/N

21.01.2026 12:14 👍 1 🔁 0 💬 1 📌 0

Why this matters:

Beyond forecasting, this approach can provide early warnings, surface local labor market stress hidden by national averages, and help flag measurement issues in real time.

8/N

21.01.2026 12:14 👍 2 🔁 0 💬 1 📌 0

Key finding 3:

This also works at the state and city (!) level, including "holdout cities" where official UI numbers are sparse or irregularly updated.

As expected, accuracy scales with platform penetration and unemployment shocks.

7/N

21.01.2026 12:14 👍 2 🔁 0 💬 1 📌 0

Key finding 2:

Our approach consistently outperforms industry consensus forecasts and can improve predictions of US UI claims up to two weeks ahead of official releases.

That's two weeks of additional lead time for policymakers.

6/N

21.01.2026 12:14 👍 1 🔁 0 💬 1 📌 0

Key finding 1:

Capturing linguistic diversity matters.

Training LLMs with active learning lets us detect many more ways people talk about job loss, producing a far more representative sample of unemployed users than existing approaches.

5/N

21.01.2026 12:14 👍 1 🔁 0 💬 1 📌 0
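For readers unfamiliar with active learning, the core loop is: score the unlabeled pool with the current model, send the posts the model is least sure about to annotators, retrain, repeat. A minimal uncertainty-sampling sketch (the keyword scorer is a toy stand-in, not the paper's actual classifier):

```python
def uncertainty(p):
    """Distance from the decision boundary; 0 means maximally uncertain."""
    return abs(p - 0.5)

def select_for_labeling(unlabeled, model_score, budget):
    """One active-learning round: pick the posts the model is least sure about."""
    return sorted(unlabeled, key=lambda post: uncertainty(model_score(post)))[:budget]

# Toy stand-in for a classifier score; a real pipeline would use the
# predicted probability from a fine-tuned model such as JoblessBERT.
KEYWORDS = {"fired", "laid", "unemployed"}
def toy_score(post):
    hits = len(set(post.lower().split()) & KEYWORDS)
    return min(1.0, hits / 2)

pool = [
    "got fired today, wish me luck",       # score 0.5 -> most uncertain
    "great weather this weekend",          # score 0.0 -> confident negative
    "laid off and officially unemployed",  # score 1.0 -> confident positive
]
print(select_for_labeling(pool, toy_score, budget=1))
```

Labeling the uncertain middle is what surfaces the many non-obvious ways people talk about job loss.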
Multilingual Detection of Personal Employment Status on Twitter Manuel Tonneau, Dhaval Adjodah, Joao Palotti, Nir Grinberg, Samuel Fraiberger. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

We combine JoblessBERT (an encoder LLM developed in previous work, aclanthology.org/2022.acl-lon..., which detects ~3× more employment-related content without sacrificing precision) with post-stratification using inferred demographics to correct for platform bias.

4/N

21.01.2026 12:14 👍 2 🔁 0 💬 1 📌 0
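Post-stratification reweights platform users so the sample matches census demographics. A minimal one-variable sketch (shares and outcome rates are invented for illustration; the paper stratifies on inferred demographics):

```python
# Hypothetical platform vs. census age shares; real work would stratify on
# several inferred demographics at once.
platform_share = {"18-29": 0.45, "30-49": 0.40, "50+": 0.15}
census_share   = {"18-29": 0.20, "30-49": 0.35, "50+": 0.45}

# Post-stratification weight: up/down-weight each platform stratum so the
# weighted sample matches the census.
weights = {g: census_share[g] / platform_share[g] for g in census_share}

# Illustrative per-stratum job-loss rates observed on the platform.
outcome = {"18-29": 0.08, "30-49": 0.05, "50+": 0.03}

raw = sum(platform_share[g] * outcome[g] for g in outcome)               # biased
adj = sum(platform_share[g] * weights[g] * outcome[g] for g in outcome)  # reweighted
print(f"raw={raw:.4f} adjusted={adj:.4f}")
```

Because young users are over-represented on the platform, the raw estimate overstates the job-loss rate; reweighting pulls it back toward the population value.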

So we ask a hard question that economic actors and policymakers rightly worry about:

Can skewed social media data be turned into trustworthy indicators of unemployment?

Can we produce robust predictions across geography ✅, time ✅, demography ✅, and forecasting horizon ✅?

3/N

21.01.2026 12:14 👍 2 🔁 0 💬 1 📌 0

Why this matters:

In March 2020, weekly unemployment insurance claims jumped from 278K to nearly 6 million in two weeks.

As official data lagged, policymakers were flying blind about where the shock was hitting and who was being affected.

2/N

21.01.2026 12:14 👍 1 🔁 0 💬 1 📌 0
Can social media reliably estimate unemployment? Abstract. Digital trace data hold tremendous potential for measuring policy-relevant outcomes in real-time, yet its reliability is often questioned. Here,

🚨 New paper out in @pnasnexus.org

We show how skewed social media data can still be used to reliably estimate unemployment, not just nationally but down to the city level. 📈

doi.org/10.1093/pnas...

1/N

21.01.2026 12:14 👍 7 🔁 3 💬 1 📌 0
Illustration of the official unemployment rate published in newspapers. Stock photo.

A transformer encoder-based classifier called JoblessBERT can identify posts about unemployment on social media, allowing researchers to predict US unemployment claims, up to two weeks in advance, at the national, state, and city levels. In PNAS Nexus: https://ow.ly/Zvi850XRa8I

02.01.2026 20:30 👍 2 🔁 1 💬 0 📌 0
Home - Somewhere On Earth Productions SOMEWHERE ON EARTH PRODUCTIONS: We are here to connect technology and business to people and new possibilities.

ICYMI: Listen to @manueltonneau.bsky.social @oii.ox.ac.uk's interview with the SOEP podcast talking about his new research into hate speech, online platforms and disparities in content moderation across different European countries. Available here: bit.ly/4ntsiRU

01.10.2025 13:46 👍 1 🔁 1 💬 0 📌 1

🚨Hiring a fully funded (3.5 years) PhD for the @ldnsocmedobs.bsky.social to research social media and politics. Candidates should have quantitative/computational skills and/or be interested in content curation/moderation. UK home candidates only, unfortunately. www.royalholloway.ac.uk/media/hquftp...

29.09.2025 17:21 👍 4 🔁 14 💬 1 📌 3

📣 New Preprint!
Have you ever wondered what political content is in LLMs' training data? What political opinions are expressed? What is the proportion of left- vs right-leaning documents in the pre- and post-training data? Do they correlate with the political biases reflected in the models?

29.09.2025 14:54 👍 47 🔁 14 💬 2 📌 1

Social media feeds today are optimized for engagement, often leading to misalignment between users' intentions and technology use.

In a new paper, we introduce Bonsai, a tool to create feeds based on stated preferences, rather than predicted engagement.

arxiv.org/abs/2509.10776

16.09.2025 13:24 👍 159 🔁 46 💬 5 📌 7
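The contrast between engagement-optimized and preference-based ranking can be sketched in a few lines (topic names, scores, and the scoring rule are invented for illustration; see the arXiv paper for Bonsai's actual design):

```python
# Invented topics, weights, and scoring rule, purely to illustrate ranking
# by stated preferences instead of predicted engagement.
stated_prefs = {"science": 1.0, "local news": 0.6, "outrage bait": -1.0}

posts = [
    {"id": 1, "topics": {"science": 0.9},      "predicted_engagement": 0.2},
    {"id": 2, "topics": {"outrage bait": 0.8}, "predicted_engagement": 0.9},
    {"id": 3, "topics": {"local news": 0.7},   "predicted_engagement": 0.4},
]

def preference_score(post):
    """Match a post's topic mix against the user's stated preferences."""
    return sum(stated_prefs.get(t, 0.0) * w for t, w in post["topics"].items())

by_engagement = sorted(posts, key=lambda p: -p["predicted_engagement"])
by_preference = sorted(posts, key=lambda p: -preference_score(p))
print([p["id"] for p in by_engagement])  # [2, 3, 1]: outrage bait ranks first
print([p["id"] for p in by_preference])  # [1, 3, 2]: science first, bait last
```

The same three posts, ranked by the two objectives, end up in nearly opposite orders: that gap is the misalignment the post describes.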
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses.
For a hypothesis with no true effect (ground truth p > 0.05), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking: incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825

12.09.2025 10:33 👍 303 🔁 106 💬 6 📌 23
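The mechanism can be illustrated deterministically: if an annotator's error rate differs across groups, a comparison with no true effect looks statistically significant. A sketch using expected label rates and a pooled two-proportion z-test (all numbers invented for illustration, not from the paper's replication pipeline):

```python
import math

def two_prop_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

n = 1000     # posts per group
base = 0.30  # true positive-label rate in BOTH groups, i.e. no true effect

def observed_rate(flip):
    """Expected label rate after an annotator that flips labels at rate `flip`."""
    return base * (1 - flip) + (1 - base) * flip

# Config A: errors independent of group. Config B: more errors in group 1,
# e.g. an LLM that mislabels one group's writing style more often.
for name, f1, f2 in [("config A", 0.05, 0.05), ("config B", 0.20, 0.05)]:
    z = two_prop_z(observed_rate(f1), n, observed_rate(f2), n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"{name}: z={z:.2f} -> {verdict}")
```

Config A correctly finds nothing; config B, differing only in its group-dependent error rate, crosses the significance threshold despite identical ground truth.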