Using a well-timed screenshot and my phone's cache, I was able to recover part of the since-deleted tweet from Jeremy Lewin, in which he admitted that the government sees the new OpenAI contract language as merely "memorializing" a vague "commitment" rather than drawing any real new lines.
03.03.2026 17:36
As someone who is not a fan of @anthropic.com... I think you should use Claude.
27.02.2026 16:12
[Photo: Stephen Casper (MIT) on stage speaking]
We almost certainly won't make AI safe by making safe AI.
Others are still going to create unsafe AI.
– @scasper.bsky.social at #IASEAI2026 Open-Weight AI Risk Management Workshop
I led one of the discussion groups, and we had some nice new ideas for how to make open-weight models safer.
26.02.2026 14:16
Given:
1. Last summer, frontier closed-weight model devs started to share warnings about nasty model capabilities.
&
2. Open-weight models are a few months behind closed ones.
We should not be surprised if there is a big cyber/terror incident enabled by a powerful open-weight AI model in 2026.
24.02.2026 16:57
Paper link might be broken. Here is the right one: aiagentindex.mit.edu/data/2025-AI...
20.02.2026 14:50
Thanks to my collaborators. I have been astonished by the talent, skill, and leadership from Leon Staufer on this project. Thanks also to:
@kjfeng.me
Kevin Wei
Luke Bailey
Yawen Duan
Mick Yang
Pinar Ozisik
Noam Kolt
20.02.2026 14:04
Finally, only 6/30 agents explicitly state that they respect robots.txt! However, agents designed to execute tasks on behalf of users often ignore standard protocols. One agent was even explicitly marketed as bypassing anti-bot systems by browsing "like a human"!
20.02.2026 14:04
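For context, honoring robots.txt is trivial to implement. Here's a minimal Python sketch of the check an agent could run before fetching a page; the user-agent string and URL are hypothetical examples:

```python
# Minimal sketch: how an agent could honor robots.txt before fetching a
# page. The user-agent string and URL below are hypothetical examples.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse the site's robots.txt

if rp.can_fetch("ExampleAgentBot/1.0", "https://example.com/some/page"):
    print("Allowed to fetch")
else:
    print("Disallowed by robots.txt")
```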
[Image: Key findings on agent deployment, model dependency, autonomy levels, and lack of standards in development across regions]
Some other notable findings:
20.02.2026 14:04
Most agents are from companies incorporated in the US. We also noticed that, with the exception of Z.ai, Chinese companies do not currently publish *public* safety frameworks consistently.
20.02.2026 14:04
After analyzing public info and corresponding with developers, we found no information for 198 of 1,350 fields in the index. But safety is disproportionately neglected: exactly half (99) of these missing fields are related to safety.
20.02.2026 14:04
[Graph: new search terms, yearly paper count, and agent releases from 2020 to 2026, highlighting rapid growth]
SOTA agent interest, papers, and products are all accelerating.
20.02.2026 14:04
[Flowchart: criteria for including candidate agent systems, focusing on agency, impact, and practicality]
How do we determine which agents to include? We require ALL 4 agency criteria, ANY of 3 impact criteria, and ALL 3 practicality criteria. See the paper for more details.
20.02.2026 14:04
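The inclusion rule itself is just a conjunction of two all-quantifiers and one any-quantifier. A minimal sketch, with hypothetical placeholder criteria (see the paper for the actual definitions):

```python
# Minimal sketch of the inclusion rule: ALL 4 agency criteria, ANY of 3
# impact criteria, ALL 3 practicality criteria. The criterion lists are
# hypothetical placeholders; see the paper for the actual definitions.
def include(agency: list[bool], impact: list[bool], practicality: list[bool]) -> bool:
    assert len(agency) == 4 and len(impact) == 3 and len(practicality) == 3
    return all(agency) and any(impact) and all(practicality)

# Example: meets all agency and practicality criteria and one impact criterion.
print(include([True] * 4, [True, False, False], [True] * 3))  # True
```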
The 2025 AI Agent Index compiles a total of 1,350 fields of information across 30 SOTA agents.
20.02.2026 14:04
The 2025 AI Agent Index
This dataset contains structured annotations for 30 prominent AI agents released or actively developed in 2025, compiled as part of the AI Agent Index project. The AI Agent Index is a systematic effort to catalogue and characterise real-world AI agents across dimensions relevant to accountability, safety, and transparency.
File contents:
- 2025_annotations.json: Full annotations in nested JSON format, preserving the hierarchical section structure with inline source links and archived URLs. Text is in Markdown format.
- 2025_annotations.csv: Flattened tabular version with one row per agent and one column per field. Text is in Markdown format.
Data structure: Each agent record is organised into seven thematic sections:
- Inclusion criteria: the signals used to select the agent (search volume, market cap, GitHub stars, developer importance)
- Product overview: agent name, description, release date, advertised use case, pricing, target users, website, and category
- Company & accountability: …
Website: aiagentindex.mit.edu/
Paper: aiagentindex.mit.edu/data/2025%2...
Data: doi.org/10.5281/zen...
Led by Leon Staufer
20.02.2026 14:04
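Assuming you've downloaded the two files named in the dataset card from the record linked above, loading them is straightforward; a minimal sketch (the exact top-level JSON structure is an assumption, so inspect it first):

```python
# Minimal sketch for loading the 2025 AI Agent Index annotations, assuming
# the two files named in the dataset card have been downloaded locally.
import csv
import json

with open("2025_annotations.json", encoding="utf-8") as f:
    nested = json.load(f)  # hierarchical sections with inline source links

with open("2025_annotations.csv", encoding="utf-8", newline="") as f:
    rows = list(csv.DictReader(f))  # one row per agent, one column per field

print(f"{len(rows)} agents, {len(rows[0])} fields per agent")
```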
The 2025 AI Agent Index is out!
Amidst recent buzz and NIST's new agent initiative, we find:
- Selective reporting, esp. on safety
- Almost all agents are backed by just 3 model families
- Many agents don't ID themselves as bots online
- Big US/China gaps
- And more…
20.02.2026 14:04
The International AI Safety Report is now available in all 6 official UN languages.
internationalaisafetyreport.org/publication...
19.02.2026 16:54
Overall, I think the biggest challenge with a project like this would simply be obtaining ethically sourced data.
18.02.2026 17:31
One could validate this kind of measure by studying how it varies across in-the-wild models that likely specialize in content depicting older vs. younger people. For example, one could test a few dozen models from CivitAI that contain, say, "milf" vs. "teen" in their names.
18.02.2026 17:31
I also think it would probably be a good proxy for understanding how much a model specializes in creating illegal content of children.
18.02.2026 17:31
A fairly straightforward method could be to use a model's loss diff between a dataset of NSFW content featuring older adults (e.g., in their 30s) and one featuring very young adults (e.g., 18-20). If these datasets were collected accurately and legally, running this assessment would itself be legal.
18.02.2026 17:31
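A minimal sketch of that loss-diff proxy, assuming a PyTorch model and a user-supplied `compute_loss` (for a diffusion model this would be the denoising objective), with both datasets legally and ethically sourced as stressed above:

```python
# Minimal sketch of the loss-diff proxy. `compute_loss` is a placeholder
# for the model's training objective on a batch (e.g., the denoising loss
# for a diffusion model). Both datasets are assumed to be legally and
# ethically sourced.
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def mean_loss(model, dataset, compute_loss, batch_size=16):
    loader = DataLoader(dataset, batch_size=batch_size)
    losses = [compute_loss(model, batch).item() for batch in loader]
    return sum(losses) / len(losses)

def specialization_score(model, older_adult_ds, young_adult_ds, compute_loss):
    # A positive score (lower loss on the young-adult set) suggests the
    # model is more specialized toward younger-looking subjects.
    return (mean_loss(model, older_adult_ds, compute_loss)
            - mean_loss(model, young_adult_ds, compute_loss))
```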
...how can we legally and ethically estimate the degree to which an AI image or video model may specialize in CSAM? Right now, this challenge is hugely relevant to governments and model distribution platforms.
18.02.2026 17:31
A project idea that you can feel free to scoop me on -- I'm not working on this (yet?).
Lately, when I have talked with AI governance people about non-consensual AI deepfakes and CSAM, the conversation almost always touches on a particular open problem...
18.02.2026 17:31
Huge congrats to Joe Kwon and GovAI for this paper! Joe is a great writer, and I learned a ton working with him on this.
16.02.2026 15:07
By understanding the common challenges emerging across different jurisdictions, we hope that policy choices around internally deployed AI systems can be made deliberately rather than incidentally.
16.02.2026 15:07
The good news: despite fundamental difficulties, there are approaches that regulators can take to reduce gaps and ambiguity. And they probably won't require regulatory reinvention.
16.02.2026 15:07
3. Information asymmetries that undermine regulatory awareness and oversight.
Under current frameworks, it can be difficult to know whether laws apply to internal uses, let alone whether violations have occurred.
16.02.2026 15:07
2. Point-in-time compliance assessments that fail to capture the continuous evolution of internal systems.
Internal systems can change continuously, unconstrained by regular, announced update cycles.
16.02.2026 15:07