Wikipedia volunteers spent years cataloging AI tells. Now there's a plugin to avoid them.
The web's best guide to spotting AI writing has become a manual for hiding it.
Some #generativeAI developers love to destroy the foundations of the tech they build. #Wikipedia is one of the most valuable sources of genAI training data. Undermining it is not just attacking a great common resource. It's also completely self-destructive. arstechnica.com/ai/2026/01/n...
22.01.2026 16:53
The Nonprofit Doing the AI Industry's Dirty Work
The web archive Common Crawl has been quietly funneling paywalled articles to AI companies, and lying to publishers about it.
A little-known nonprofit has been lying to news publishers while funneling millions of paywalled articles to tech companies for AI training. Read my investigation in The Atlantic. www.theatlantic.com/technology/2...
04.11.2025 15:58
Check in if you're interested in my thoughts about what open source AI should aspire to be in relation to proprietary AI
02.10.2025 11:03
"The update is yet another signal that payment processors...are currently the ultimate arbiter of what kind of content can be made easily available online, or not."
16.07.2025 20:08
The key questions we always should ask when people talk about AI: What is being automated and why? @alexhanna.bsky.social @weizenbauminstitut.bsky.social
30.06.2025 16:47
"AI is a labor disciplining device" @alexhanna.bsky.social
30.06.2025 16:25
"The reporter is a man of critical value. No amount of money or effort spent in fitting the right men for this work could possibly be wasted, for the health of society depends upon the quality of the information it receives." - Walter Lippmann [a century later, I'd swap "man" for "person" though]
11.05.2025 14:38
(S+) Deepfake porn: The insidious business of faked sex videos
Thousands of women become victims of faked porn videos that show their faces. Those affected include underage girls, celebrities, and politicians. Behind it are unscrupulous businesspeople. ...
New Release! Most AI deepfakes aren't political. 90% of deepfakes are non-consensual intimate imagery. 99% of victims are women. Max Hoppenstedt, @rechercheur.bsky.social, @romanhoefner.bsky.social, and I uncover a deepfake community and the business behind undress apps www.spiegel.de/netzwelt/web...
09.12.2024 13:56
"brainstorming and iteration is...a crucial everyday part of game development...and is not a problem to be solved...I have had many discussions with other game developers who interact with AI engineers and savants who believe our industry pipelines need 'fixing' by them and them alone"
08.04.2025 15:28
«By moving fast and breaking things, DOGE forces a collapse of the system where unanswered questions are met with technological solutions. Shifting the conversation to the technical is a way of locking policymakers and the public out of decisions and shifting that power to the code they write.»
09.02.2025 07:08
You Can't Post Your Way Out of Fascism
Authoritarians and tech CEOs now share the same goal: to keep us locked in an eternal doomscroll instead of organizing against them, Janus Rose writes.
www.404media.co/you-cant-pos...
05.02.2025 17:03
A bird's-eye view of a former Auschwitz II-Birkenau camp showing a wide dirt pathway flanked by parallel rows of barbed-wire fences. Groups of visitors walk along the path, surrounded by the remnants of brick structures and barracks, now reduced to foundations. Green grass contrasts with the somber history of the site, as the path leads toward a guard tower in the distance.
Auschwitz was at the end of a long process. It did not start from gas chambers.
This hatred was gradually developed by humans. From ideas, words, stereotypes & prejudice through legal exclusion, dehumanization & escalating violence... to systematic and industrial murder.
Auschwitz took time.
27.01.2025 10:00
"AI is fake and sucks" vs "AI is real and dangerous" is a Twitter argument. In reality I think the debate also has a lot of "AI is real but not for how you're using it," to "AI is fake and that is dangerous," to "things are happening to real people because of AI hype and that should stop."
06.12.2024 07:29
My reading for this week, delivered to me by the great @aschrock.bsky.social themself! Thank you, looking forward to reading :-)
03.12.2024 15:53
This report gives hope!
More and more new, ambitious media outlets have been founded in Germany and Europe. Outlets whose goal is to give the public high-quality information.
@netzwerkrecherche.org surveyed 174 media outlets in 31 countries for the "Journalism Value Report" and can show:
03.12.2024 11:12
"Without facts, you can't have truth, and without truth, you can't have trust." - Maria Ressa, 2021 Nobel Peace Prize
20.11.2024 11:43
The Onion should buy Elsevier next
14.11.2024 20:28
It ended well though. He got the job, and still has it. We met recently
21.02.2024 21:48
I still remember when a friend asked for advice about getting a job I intended to apply for
21.02.2024 09:07
Long term, there should be less reliance on sources like Common Crawl and a bigger emphasis on training generative AI on datasets created and curated by people in equitable and transparent ways (10/10)
06.02.2024 16:03
A key issue is that filtered Common Crawl versions are not updated after their original publication to take feedback and criticism into account. Therefore, we need dedicated intermediaries tasked with filtering Common Crawl in transparent and accountable ways that are continuously updated (9/10)
06.02.2024 16:03
AI builders should put more effort into filtering Common Crawl, and establish industry standards and best practices for end-user products to reduce potential harms when using Common Crawl or similar sources for training data (8/10)
06.02.2024 16:03
Both Common Crawl and AI builders can help make generative AI less harmful. Common Crawl should highlight the limitations and biases of its data, be more transparent and inclusive about its governance, and enforce more transparency by requiring AI builders to attribute their use of Common Crawl (7/10)
06.02.2024 16:03
Due to Common Crawl's deliberate lack of curation, AI builders need to filter it with care, but such care is often lacking. Popular filtered versions like C4 are especially problematic as the filtering techniques used to create them are simplistic and leave lots of harmful content untouched (6/10)
06.02.2024 16:02
The Common Crawl archive is massive, but far from being a "copy of the internet." Its crawls are automated to prioritize pages on domains that are frequently linked to, making digitally marginalized communities less likely to be included. Moreover, most captured content is English (4/10)
06.02.2024 16:02
Using Common Crawl's data does not easily align with trustworthy and responsible AI development because Common Crawl deliberately does not curate its data. It doesn't remove hate speech, for example, because it wants its data to be useful for researchers studying hate speech (3/10)
06.02.2024 16:02