Jenna Russell

@jennarussell

CS PhD Student @ UMD Undergrad @ Cornell https://jenna-russell.github.io/

978 Followers
395 Following
28 Posts
Joined 07.11.2024

Latest posts by Jenna Russell @jennarussell

Thanks to my amazing coauthors
@markar.bsky.social, Destiny Akinode, @kthai1618.bsky.social, Bradley Emi, Max Spero, and @miyyer.bsky.social, and to the UMD CLIP lab and Pangram Labs for their support

22.10.2025 15:24 👍 1 🔁 0 💬 0 📌 0

We will be continuously monitoring American news to keep up with how AI use changes over time. Follow along at 🌐 ainewsaudit.github.io

22.10.2025 15:24 👍 2 🔁 1 💬 1 📌 1
AI use in American newspapers is widespread, uneven, and rarely disclosed
AI is rapidly transforming journalism, but the extent of its use in published newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from online...

We’re releasing:
🌐 Browse articles: ainewsaudit.github.io
📂 Datasets (recent_news, opinions, ai_reporters): github.com/jenna-russe...
📄 Paper: arxiv.org/abs/2510.18774

22.10.2025 15:24 👍 8 🔁 3 💬 1 📌 0

AI has been creeping into the news all of us read, often without any disclosure. We call for clearly defined standards for U.S. newsrooms:
1️⃣ Clearly define what counts as acceptable use of AI and publish these standards openly
2️⃣ Require AI-use attestations for all writers

22.10.2025 15:24 👍 15 🔁 4 💬 1 📌 0

Many AI-written stories still contain authentic quotes. We hypothesize that people often use AI for editing or expanding on their human-written work. But with no disclosure, there's no way to tell for sure.

22.10.2025 15:24 👍 2 🔁 0 💬 1 📌 0

We also track how AI adoption has evolved over time:
Among 10 veteran reporters we followed longitudinally, AI use rose from 0% pre-ChatGPT (2022) to >40% in 2025.

22.10.2025 15:24 👍 0 🔁 0 💬 1 📌 0

AI is disproportionately affecting news written in languages other than English. Roughly 8% of English news is AI-generated, compared to 33% of non-English news (primarily Spanish). Without disclosure, we cannot be sure whether AI is translating stories or writing them.

22.10.2025 15:24 👍 0 🔁 0 💬 1 📌 0

In NYT, WaPo & WSJ, opinion sections show 6.4× higher AI use than other sections, rising ~25× since 2022 (from ~0% → ~4%).
AI use is concentrated among prominent guest authors: politicians, CEOs, and scientists.

22.10.2025 15:24 👍 4 🔁 2 💬 1 📌 2

Despite widespread use, transparency is basically nonexistent.
Out of 100 AI-flagged articles we manually annotated, only 5 disclosed that AI was used, and over 90% of outlets have no public AI policy.

22.10.2025 15:24 👍 3 🔁 0 💬 1 📌 0

AI use isn’t evenly distributed:
🗞️ Far higher in small local papers than national outlets
🌎 Especially common in Mid-Atlantic & Southern states
🏢 Largely driven by ownership groups (e.g. Boone Newsmedia & Advance Publications)
🧭 Most concentrated in weather, tech, and health

22.10.2025 15:24 👍 3 🔁 4 💬 1 📌 0
Pangram Labs AI Detection
The most accurate technology to detect AI-generated content. Detects ChatGPT, Gemini, Meta AI, Claude, and more. Supports 20+ languages with 99.98%+ accuracy.

We detect AI using Pangram, a model with a reported false positive rate of 0.001% on news text. We find that 5.2% of recent news is completely AI-generated, with another 3.9% partially AI-generated. www.pangram.com/

22.10.2025 15:24 👍 2 🔁 1 💬 1 📌 0
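
As a rough check on the figures in this post, the two shares can be added and turned into article counts. This is a back-of-the-envelope sketch, not the paper's analysis: it assumes the percentages apply to the full 186k-article sample mentioned in the thread opener, and all variable names are illustrative. Because a false positive rate only applies to human-written articles, multiplying it by the whole sample gives an upper bound on expected false positives.

```python
# Back-of-the-envelope check of the thread's headline numbers.
# Assumption (not stated in the post): the shares apply to all 186k articles.
TOTAL_ARTICLES = 186_000
FULLY_AI_SHARE = 0.052          # 5.2% completely AI-generated
PARTIALLY_AI_SHARE = 0.039      # 3.9% partially AI-generated
FALSE_POSITIVE_RATE = 0.00001   # 0.001% reported for Pangram on news text

# Combined share of articles with any detected AI generation.
combined_share = FULLY_AI_SHARE + PARTIALLY_AI_SHARE

# Approximate number of flagged articles under the assumption above.
flagged_articles = round(TOTAL_ARTICLES * combined_share)

# Upper bound: treats every article as human-written.
max_expected_false_positives = TOTAL_ARTICLES * FALSE_POSITIVE_RATE

print(f"combined share: {combined_share:.1%}")
print(f"flagged articles: {flagged_articles}")
print(f"max expected false positives: {max_expected_false_positives:.2f}")
```

Under these assumptions the combined share rounds to the ~9% quoted at the top of the thread, and the expected number of false positives is tiny compared with the number of flagged articles.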

AI is already at work in American newsrooms.

We examine 186k articles published this summer and find that ~9% are either fully or partially AI-generated, usually without readers having any idea.

Here's what we learned about how AI is influencing local and national journalism:

22.10.2025 15:24 👍 55 🔁 29 💬 5 📌 2

🤔 What if you gave an LLM thousands of random human-written paragraphs and told it to write something new -- while copying 90% of its output from those texts?

🧟 You get what we call a Frankentext!

💡 Frankentexts are surprisingly coherent and tough for AI detectors to flag.

03.06.2025 15:09 👍 34 🔁 8 💬 1 📌 1

International students will stop coming to American universities if their visas are going to be at risk. This will make our intellectual community poorer and also make tuition more expensive for domestic students.

08.04.2025 01:52 👍 595 🔁 165 💬 7 📌 16

There is a quasi-religion in Silicon Valley that views AI as godlike. This faith has always been parallel to Evangelical Christianity: salvation (transhumanism), the rapture (the technological singularity), and demons (Roko's Basilisk)

Lately the AI faith has fully fused with Christian Nationalism.

21.03.2025 22:51 👍 5985 🔁 1422 💬 101 📌 257

Introducing 🐻 BEARCUBS 🐻, a “small but mighty” dataset of 111 QA pairs designed to assess computer-using web agents in multimodal interactions on the live web!
✅ Humans achieve 85% accuracy
❌ OpenAI Operator: 24%
❌ Anthropic Computer Use: 14%
❌ Convergence AI Proxy: 13%

12.03.2025 14:00 👍 11 🔁 5 💬 1 📌 3

Is the needle-in-a-haystack test still meaningful given the giant green heatmaps in modern LLM papers?

We create ONERULER 💍, a multilingual long-context benchmark that allows for nonexistent needles. Turns out NIAH isn't so easy after all!

Our analysis across 26 languages 🧵👇

05.03.2025 17:06 👍 14 🔁 5 💬 1 📌 3

⚠️ Current methods for generating instruction-following data fall short for long-range reasoning tasks like narrative claim verification.

We present CLIPPER ✂️, a compression-based pipeline that produces grounded instructions for ~$0.5 each, 34x cheaper than human annotations.

21.02.2025 16:25 👍 21 🔁 8 💬 1 📌 2

Also, the nonexperts have a range of LLM usage. Having a writing background is key, a fact many are missing.

29.01.2025 12:39 👍 1 🔁 0 💬 0 📌 0

Hi Shane. We originally used 5 people, only 1 of whom could detect AI-generated text. I then searched out people who I thought could be experts and they had to pass multiple rounds of testing to be included in the study. Details in appendix. Nonexpert performance is already widely known.

29.01.2025 12:39 👍 0 🔁 0 💬 0 📌 0

This is a great question - we didn’t dive deeper than choosing articles from American publications. In a few cases, experts noted awkward phrasing and thought the author could be a non-native speaker, but still knew it was a human!

29.01.2025 12:36 👍 1 🔁 0 💬 1 📌 0

It would be very interesting to see if every language had its own set of “AI vocab” words 🤣

29.01.2025 00:16 👍 0 🔁 0 💬 1 📌 0

I think what matters is users who do writing tasks like editing/publishing! It’s the mix of having great language skills and frequent usage. A lot of ppl who just use LLMs a lot are way worse detectors than they think they’ll be.

29.01.2025 00:15 👍 4 🔁 0 💬 0 📌 0
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text
In this paper, we study how well humans can detect text generated by commercial LLMs (GPT-4o, Claude, o1). We hire annotators to read 300 non-fiction English articles, label them as either human-writt...

Link found in last post of thread 😀 (but putting it here again) arxiv.org/abs/2501.15654

28.01.2025 15:45 👍 2 🔁 0 💬 1 📌 0
GitHub - jenna-russell/human_detectors

📎 Paper: arxiv.org/abs/2501.15654
👩‍💻 Code & Data: github.com/jenna-russe...

Thanks to my amazing coauthors @markar.bsky.social and @miyyer.bsky.social and the support of UMass NLP

28.01.2025 14:55 👍 9 🔁 0 💬 2 📌 0

We're releasing our dataset of articles and expert annotations! 📂✨
We hope this helps users of automatic detectors understand not just if a text is AI-generated, but why. 🤖📖

28.01.2025 14:55 👍 3 🔁 0 💬 1 📌 0

Can LLMs mimic human expert detectors? 🤔

We prompted LLMs to imitate our expert annotators. The results show promise, outperforming detectors like Binoculars and RADAR. 🚀 However, LLMs still fall short of matching our human experts and advanced detectors like Pangram. ⚖️👥

28.01.2025 14:55 👍 2 🔁 0 💬 1 📌 0

What they get wrong: ❌

Sometimes, humans get tripped up by:
📚 Common "AI vocab" words in human-written texts
✍️ Grammar mistakes they assume "AI wouldn’t make"
🌀🗣️ One expert was often fooled by o1's use of informal language - like slang, contractions, and colloquialisms.

28.01.2025 14:55 👍 7 🔁 2 💬 1 📌 0

What experts get right: ✅

They spot telltale signs of AI, like:
📚 "AI Vocab" (delve, crucial, vibrant ...)
🔄 Predictable sentence structure
🗨️ Quotes that feel too polished

For human-written content, they look for:
🎨 Creativity
🎭 Stylistic quirks
🌊 A natural & clear flow

28.01.2025 14:55 👍 13 🔁 3 💬 1 📌 0

Across GPT-4o, Claude, and o1 articles, experts correctly identified 99.3% of AI-generated content without misclassifying any human-written articles. 🕵️‍♀️

Among automatic detectors, Pangram significantly outperformed the rest, missing only a few more texts than the experts. 🔍⚡

28.01.2025 14:55 👍 10 🔁 0 💬 1 📌 0
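
For readers unfamiliar with how the two numbers in this post relate, "identified 99.3% of AI-generated content" is a true positive rate and "without misclassifying any human-written articles" is a false positive rate of zero. The sketch below shows how both are computed from labels and predictions. The `detection_rates` helper and the toy data (150 AI-generated and 150 human-written articles, with one AI article missed) are hypothetical and chosen only for illustration; they are not taken from the released dataset.

```python
def detection_rates(labels, predictions):
    """Return (true_positive_rate, false_positive_rate).

    In both lists, True means "AI-generated" and False means "human-written".
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # AI, flagged as AI
    fn = sum(1 for y, p in pairs if y and not p)      # AI, missed
    fp = sum(1 for y, p in pairs if not y and p)      # human, flagged as AI
    tn = sum(1 for y, p in pairs if not y and not p)  # human, passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical toy data (an assumption for illustration only):
# 150 AI-generated articles followed by 150 human-written ones,
# with exactly one AI-generated article missed by the detector.
labels = [True] * 150 + [False] * 150
predictions = [True] * 149 + [False] + [False] * 150

tpr, fpr = detection_rates(labels, predictions)
print(f"TPR: {tpr:.1%}, FPR: {fpr:.1%}")
```

The same function can be applied to any detector's predictions, human or automatic, which makes it easy to compare them on equal terms.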