Our new paper is out today in @pnasnexus.org with colleagues at Yale (@matthewshu.com, Danny Karell, @keitarookura.bsky.social)
We wanted to understand how using AI-generated summaries to learn about history influenced attitudes compared to existing resources like Wikipedia. 1/4
03.03.2026 16:55
Finally blogged about my paper (led by @zarine.net) that seeks to explain why Croatian Wikipedia spent a decade captured by a cabal of political extremists and became a site for Holocaust revisionism, while other similar Wikipedia languages seemed to have fared much better. mako.cc/copyrighteou...
22.02.2026 21:45
Interoperability as Equity: Collaborative Cultural Heritage Knowledge Graphs as a Tool to Shape Inclusive Ontologies | Journal of Open Humanities Data
Very happy to share that our new paper, "Interoperability as Equity: Collaborative Cultural Heritage Knowledge Graphs as a Tool to Shape Inclusive Ontologies" is out! We are discussing linked open data, ontologies, Wikidata and interoperability with CIDOC CRM
doi.org/10.5334/johd...
02.02.2026 13:29
A paper screenshot:
Refractive datasets as a sensemaking methodology in closed data ecosystems
Anna Beers, Viviane Ito, Agustin Orozco, Patrick Gildersleve, Pablo Aragón, and Francesca Tripodi
Abstract
As digital platforms restrict their APIs, researchers face diminishing options for studying social phenomena in digital environments. During what has been called the post-API era, researchers have found themselves looking for reliable data sources in an unreliable and frequently changing platform data ecosystem. In this context, we propose analyzing refractive datasets as a methodology for researchers to understand the dynamics of closed data platforms. Refractive datasets come from platforms with relatively more open data policies, and their analysis sheds light on platforms with more restrictive data policies. Like a prism, refractive datasets reflect but also transform data-based phenomena unfolding on closed platforms. Using refractive datasets from Wikipedia and Google Trends, we present three studies to demonstrate our methodology. We first show how refractive data from Wikipedia's multiple language editions can be used to understand a fractured global platform ecosystem in a case study of hydroxychloroquine, a purported COVID-19 medicine. Second, we use Google Trends to show how similar refractive analyses can be used to understand information lost to platform deletion, in a profile of an online panic over the drug brand Galaxy Gas. Finally, we show how Wikipedia data can be used as a grounding point for a refractive analysis of how new generative algorithms reproduce and distort data across the social web. We discuss how refractive datasets can be a way for researchers to “sensemake” in increasingly opaque big data environments, enabling interpretivist analyses which aim to generate new hypotheses rather than verify existing claims.
Happy 25th birthday to Wikipedia! 🥳
A fitting moment to share
1. Their great site to mark the occasion: wikipedia25.org
2. A paper in Big Data & Society, published over the winter break, where we develop Wikipedia as a “Refractive Dataset”, led by @beeeeeers.bsky.social: doi.org/10.1177/2053...
15.01.2026 20:53
ICYMI: Finally blogged about an old paper led by @kayleachampion.bsky.social that developed a new method (forensic qualitative analysis) to understand the nature and value of @torproject.org users' contributions to @wikipedia.org. mako.cc/copyrighteou...
01.02.2026 12:34
"The Block Log: 20 Years of Content Moderation on Wikipedia" rhododendrites.com/pdfs/The%20B...
07.01.2026 12:29
Effects of Algorithmic Flagging on Fairness: Quasi-experimental Evidence from Wikipedia
Note: I have not published blog posts about my academic papers over the past few years. To ensure that my blog contains a more comprehensive record of my published papers and to surface these for f…
ICYMI: Finally blogged about an "old" paper led by @groceryheist.cc that uses data from a @wikipedia.org system to show how the introduction of a biased AI flagging system can still lead to more fairness because the humans without the system are even more biased. mako.cc/copyrighteou...
03.01.2026 12:54
I’m chuffed to share that I’ve been awarded this grant with @ftripodi.bsky.social and Brett Zehner 🥳
We’ll be studying how AI systems may reproduce or reinforce biases in Wikipedia, whether by extracting knowledge from the platform or by contributing content back to it. Excited to get started!
09.12.2025 15:39
A few updates on our Grokipedia analysis: we expanded our sample to the 20,000 most edited articles on Wikipedia. The linguistic and stylistic differences are the same as reported before: generally, Grokipedia articles are longer, more difficult to read, and less referenced.
@wikiresearch.bsky.social
08.12.2025 17:03
abstract of the paper "What did Elon change? A comprehensive analysis of Grokipedia"
Elon Musk released Grokipedia on 27 October 2025 to provide an alternative to Wikipedia, the crowdsourced online encyclopedia. In this paper, we provide the first comprehensive analysis of Grokipedia and compare it to a dump of Wikipedia, with a focus on article similarity and citation practices. Although Grokipedia articles are much longer than their corresponding English Wikipedia articles, we find that much of Grokipedia's content (including both articles with and without Creative Commons licenses) is highly derivative of Wikipedia. Nevertheless, citation practices between the sites differ greatly, with Grokipedia citing many more sources deemed "generally unreliable" or "blacklisted" by the English Wikipedia community and low quality by external scholars, including dozens of citations to sites like Stormfront and Infowars. We then analyze article subsets: one about elected officials, one about controversial topics, and one random subset for which we derive article quality and topic. We find that the elected official and controversial article subsets showed less similarity between their Wikipedia version and Grokipedia version than other pages. The random subset illustrates that Grokipedia focused on rewriting the highest quality articles on Wikipedia, with a bias towards biographies, politics, society, and history. Finally, we publicly release our nearly-full scrape of Grokipedia, as well as embeddings of the entire Grokipedia corpus.
back again to share a new preprint from me and @mantzarlis.com! “What did Elon Change? A comprehensive analysis of Grokipedia” arxiv.org/abs/2511.09685
I had seen many spot analyses of individual grokipedia pages, but I was curious: how was grokipedia made? what did Elon change from wikipedia?
17.11.2025 16:10
Grokipedia cites a Nazi forum and fringe conspiracy websites
A site-wide comparison with Wikipedia sheds light on what Elon Musk is trying to do
Key points in new Cornell Tech research:
56% of Grokipedia entries carry the Wikipedia CC license, suggesting wholesale ingestion
Grokipedia’s top 100 sources include fewer news outlets and more UGC (e.g. LinkedIn scraping)
Grokipedia has fewer citations overall, making it harder to check sources
13.11.2025 14:17
#Grokipedia set out to “fix” #Wikipedia.
Turns out it mostly rewrites it, longer, slicker, less sourced.
Fluent, but fragile. @wikiresearch.bsky.social
31.10.2025 21:26
"Investigating extreme cases in Wikipedia talk pages: Some insights on user behaviours"
uplopen.com/chapters/e…
e.g. "the most prolific users, the longest threads (in terms of total duration, number of posts or number of distinct users involved) and the longest monologues"
15.10.2025 00:31
Using a Wikipedia edit-a-thon as a cross-curricular STEM representation assignment - Discover Education
Background: Wikipedia is a highly used, free, online encyclopedia with known gender disparities across its biography content. Editing Wikipedia has entered STEM classrooms as a writing-focused and sometimes equity-focused assignment. This paper presents a Wikipedia edit-a-thon event at the Wentworth Institute of Technology in Boston, Massachusetts focused on improving articles about women in STEM. This edit-a-thon promoted cross-disciplinary collaboration and community building with faculty and undergraduate students across eleven courses and disparate disciplines and offices at the university. Results: Edit-a-thon attendees edited pages on women in STEM and listened to five-minute lightning talks by women in the university community: students, former faculty, and administrators. The impacts of the event include the addition of more than 15,000 words and 100 references to more than 100 articles on Wikipedia. The event supported a variety of student learning outcomes in participating courses across disciplines in the sciences and humanities. Conclusions: A Wikipedia edit-a-thon supported student learning across multiple subjects while contributing to underdeveloped biography articles about women in STEM and helping students find a voice in the Wiki space. The edit-a-thon has potential as a cross-curricular touchpoint and to support equity and representation work.
Seredinski, A., Litchock-Morellato, F., Lange, A. et al. Using a Wikipedia edit-a-thon as a cross-curricular STEM representation assignment. Discov Educ 4, 368 (2025). doi.org/10.1007/s442... #OpenAccess
30.09.2025 08:57
"Demographic disparity in Wikipedia coverage: a global perspective" (top 12 languages) epjdatascience.springeropen.com/articles/1…
- Women slightly overrepresented (not underrepresented) among living article subjects since ~2015, but still have shorter articles
- Developing countries overrepresented
11.10.2025 05:29
"Investigating How LLMs Impact Participation in [Wikipedia]" (interviewing 16 editors) https://arxiv.org/abs/2509.07819v1
ChatGPT etc "enhance contribution quality for experienced editors" & "lower entry barriers for newcomers", but newbies struggle to align LLM outputs w Wikipedia policies
04.10.2025 01:14
The Graphic User Interface of WikiTextGraph
New paper alert: WikiTextGraph, an open-source Python package for extracting text and building multilingual Wikipedia link networks.
With: @gustavoschwartz.bsky.social, Juan Luis Suárez
Paper: openresearchsoftware.metajnl.com/articles/10....
@wikiresearch.bsky.social #wikipedia #software
17.09.2025 13:32
Critical Wikimedia Research Bibliography - Meta-Wiki
With the school year approaching, a number of scholars and I have assembled a Critical Wikimedia Research Bibliography. If you are teaching a course or doing research, we think you might find some good resources here. meta.wikimedia.org/wiki/Critica...
27.08.2025 21:25
A manifesto for Wikimedia research: Critically studying Wikimedia as infrastructure
I am pleased to announce the launch of the Manifesto for Wikimedia Research manifesto.wiki. As my co-authored Big Data & Society commentary explains, the manifesto is dedicated to a humanist and critical tradition of taking Wikipedia's importance seriously. journals.sagepub.com/doi/10.1177/...
08.07.2025 13:17
Presenter (Patrick Gildersleve) in front of a screen summarising the WikiReddit Dataset project. The slide describes it as "Every Wikipedia mention and link on Reddit, 2020-2023", includes some example usage, describes the scale of the dataset, and offers suggested use cases.
Had a great time meeting everyone and seeing all the interesting work @icwsm.bsky.social. I presented our study on the WikiReddit dataset - exploring Wikipedia’s role in fact-checking, discussion, and cross-platform attention on the web. Thank you to the organisers!
Link: ojs.aaai.org/index.php/IC...
26.06.2025 10:08
The Challenge of Peer-Produced Websites | UW College of Arts & Sciences
Communication professor Benjamin Mako Hill studies why successful peer-produced websites (like Wikipedia) eventually struggle to maintain their openness to new contributors.
UW published this really nice article about my work on governance challenges and lifecycles faced by peer-produced online communities, the work supported by my NSF CAREER grant. Check it out if you want to know what I've been thinking about and working on!
15.06.2025 15:50
Disambiguation in Wikipedia: exploring authority control mechanisms in the collaborative encyclopedia, by @florenciac.bsky.social and @tsaorin.bsky.social in #revistainfonomy
doi.org/10.3145/info...
#Controldeautoridades #Vocabularioscontrolados #Wikipedia
19.05.2025 10:15
Been a hectic semester for me but made it through. A few updates:
Had a blast as a GSI for @dbamman.bsky.social's NLP class. Was a wonderful experience.
Won the Wikimedia Foundation Research Award of the Year for our CHI paper (doi.org/10.1145/3613...) with @schasins.bsky.social and John Canny
27.05.2025 19:22
findings: (1) Wikipedia is most frequently cited by news and science websites for informational purposes, while commercial websites reference it less often. (2) The majority of Wikipedia links appear within the main content rather than in boilerplate [3/5 of https://arxiv.org/abs/2505.15837v1]
23.05.2025 06:00
WikiWorkshop 2025 Recap - Rhododendrites
I like the internet
Whipped up a #WikiWorkshop 2025 recap blog post here: rhododendrites.com/posts/WikiWo... @wikiresearch.bsky.social Some really interesting tools, methods, and studies over the last couple days!
23.05.2025 17:34
Is Wikipedia a cesspool of antisemitism? Don't trust the ADL's answer.
The ADL would have us believe Wikipedia is riddled by antisemitism. The reality is more complicated, writes a scholar whom the ADL has cited.
A recent ADL report claimed to find broad, systemic evidence of antisemitism on Wikipedia, prompting two dozen members of Congress to call into question the site's approach to moderating content related to Jews.
Some researchers cited by the ADL say their findings have been misconstrued.
16.05.2025 15:22