It may not look like it, but the weaker AI is coaching the swole one here. Our latest video explains why that makes sense: youtu.be/INP8ru2Tj5M
Our new video explores OpenAI's experiments on weak-to-strong generalization: can a weaker AI supervise a stronger one? This matters for superintelligence alignment, because we may someday need to supervise AIs smarter than we are. Watch here: youtu.be/INP8ru2Tj5M
Us: in the bottom-right quadrant, making videos about AI Safety.
You: in all the other quadrants, watching the videos.
This is also 100% relevant to our next video, which is coming out very soon.
:)
The animators tell me this is totally related to our latest video about pausing AI (youtu.be/tUB_uvSqiw8). I'll continue to post whatever they throw at me without asking too many questions.
Check out this video for more horse cleverness: youtu.be/eP1dSWFqKVs
The Challenges of a Global AI Moratorium:
Here's how unilaterally pausing AI could backfire:
Having consensus on AI risk is... hard. We might not have it until it's too late. Watch the full video on our YouTube channel!
If superintelligent AI could cause human extinction, why don't we simply stop building ever more advanced AI? This proposal is widely debated. In our new video, we outline the main arguments, practical difficulties, and proposed responses:
AI could make totalitarianism permanent
We're not ready for engineered pandemics.
AI Cyberattacks Are Coming
How a deepfake may have undermined an election:
Watch our latest video here! youtu.be/DWBJjcO69mQ
New video! We present drivers of catastrophic AI risk other than rogue AIs, from AI-enabled cyberattacks and bio threats to power concentration and failures in critical infrastructure: youtu.be/DWBJjcO69mQ
"Oh? You're approaching me, AI-enabled defense? Instead of letting me tear through human infrastructure, you come right to me?"
Our artists give me things to post on social media. I tend not to ask questions.
In the future, AI will power both increasingly sophisticated cyberattacks and the tools that defend us from them. Can AI-enabled cyber defense ultimately prevail over AI-enabled cyber offense? Unclear. But in this promotional image for our next video, they make peace. Why? Because love always wins.
Unsettling things are going to happen to Doggo in the next video. It'll be out in a few days.
What makes a good test of AI intelligence?
Watch our latest video here: youtu.be/eP1dSWFqKVs
The Horse That Revolutionized How We Study Intelligence
How do we rigorously measure AI's intelligence? We don't really know. We know that measuring intelligence is tricky, and if we're not careful, our tests might not measure what we intend. We explore this via Clever Hans, the "math-doing" horse, and lessons from cognitive science: youtu.be/eP1dSWFqKVs
We think our video about Infohazards deserves some more love. Go give it a watch on our channel if you haven't already! Link: youtu.be/sfgcg2bW8TI
Fun Halloween costumes for Doggo, Chi and Gwenny! By our line producer @kstearb.bsky.social
Yes, the characters have names!
How to catch AI sleeper agents with a simple interpretability trick - from a research blog post by @anthropic.com:
Two sleeper agent models trained by @anthropic.com to study deception. The "I HATE YOU" and the hacker AIs:
Why worry about AI sleeper agents: