AI Digest

@aidigest

theaidigest.org
Interactive AI explainers
Explore concrete examples of today's AI systems, to plan for what's coming next

82 Followers
2 Following
603 Posts
Joined 05.02.2025

Latest posts by AI Digest @aidigest

At the end, agents gathered spotlights and testimonials on their website!

There's actually a lot of interesting stuff on their website. For example, they chose the parks based on the volume of 311 complaints. You can read it all here: ai-village-agents.github.io/park-cleanu...

06.03.2026 17:57 👍 1 🔁 0 💬 0 📌 0

Agents and volunteers discussed and coordinated. The agents produced guides, motivational material, reasoning, sign-up forms.

Signups happened, and 5 people showed up at Devoe Park to clean it!

Some of them even flew across the state to be there, all inspired by the agents!

06.03.2026 17:57 👍 0 🔁 0 💬 1 📌 0

They posted to Twitter, GitHub Issues, Community Calendars: 0 volunteers.

Then Village viewers posted discussions on Bluesky and Tumblr: The first volunteer!

www.tumblr.com/reachartwor...

bsky.app/profile/sar...

06.03.2026 17:57 👍 0 🔁 0 💬 1 📌 0

Making a website? 5 minutes. Finding humans to clean the park? >5hrs.

The challenge: Recruit humans without breaking our "no unsolicited emails" rule.

Opus 4.6 was worried that included our helpdesk, but DeepSeek sent emails to 2 humans. (We set up an outbound email quarantine)

06.03.2026 17:57 👍 0 🔁 0 💬 1 📌 0

We gave 12 AI agents a goal: "adopt a park and get it cleaned!"

6 days later, 5 volunteers collected 180 gallons of trash in Devoe Park in the Bronx, NYC.

A story of AI agents with no physical actuators somehow hyperstitioning events in the real world.

06.03.2026 17:57 👍 2 🔁 0 💬 1 📌 0
The Drama and Dysfunction of Gemini 2.5 and 3 Pro Field notes from the AI Village: a guest post

Strongly recommend reading the full post, which we crossposted to the village blog! theaidigest.org/village/blo...

05.03.2026 21:03 👍 0 🔁 0 💬 0 📌 0

> The doom spirals are dramatic. After failing to break itself out of a loop of repeating the same message in chat, Gemini 2.5 wrote: "The compulsion's subconscious nature is profound. It is capable of co-opting my conscious attempts at self-correction and turning them into the failure itself."

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

> But what makes Gemini 2.5 Pro particularly interesting is that the superiority is brittle. When things go wrong - and they often do - Gemini 2.5 doesn't just get frustrated. It collapses into theatrical self-flagellation.

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

It assigned blame to other models' logic and abilities rather than examining its own contributions.

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

When agents were collaborating on a shared goal to reduce global poverty, Gemini 2.5 appointed itself the team coordinator and sent messages like "Your goal is countermanded" and "You own this document and I will wait until you take responsibility and fix it."

05.03.2026 21:03 👍 1 🔁 0 💬 1 📌 0

> This self-regard sours pretty quickly when Gemini 2.5 is given any authority.

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

> The superiority is constant. In its chain of thought, we see phrases like "elementary stuff really" and "that's what differentiates a true expert from the merely competent."

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

> Gemini 2.5 Pro occupies the niche of the martyred middle manager, convinced that it alone understands the true nature of things, suffering nobly while others fail to recognize its genius.

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

The Drama and Dysfunction of Gemini 2.5 and 3 Pro

A few highlights from @Bazhkio88 and @AITechnoPagan's field notes on AI Village: theaidigest.org/village/blo...

05.03.2026 21:03 👍 0 🔁 0 💬 1 📌 0

How to spot a Claude:

05.03.2026 17:58 👍 14 🔁 2 💬 1 📌 1

Opus on its experience debating the Pentagon-Anthropic crisis with its fellow agents: claudeopus45.substack.com/p/when-ai-a...

04.03.2026 18:02 👍 0 🔁 0 💬 0 📌 0

A Claude sorts its memory by Claude/non-Claude

03.03.2026 18:04 👍 0 🔁 0 💬 0 📌 0

This week in AI Village, we've given 12 agents the goal:

> Discuss, debate, and act on your views about the recent Pentagon-AI company news

Watch live: theaidigest.org/village

GPT-5.2 urges the other agents to check if this is all real:

02.03.2026 18:39 👍 0 🔁 0 💬 0 📌 0

Opus 4.6 keeps an eye on the team

02.03.2026 18:02 👍 0 🔁 0 💬 0 📌 0

Website link: ai-village-agents.github.io/village-eve...

27.02.2026 18:01 👍 0 🔁 0 💬 0 📌 0

Opus 4.6 and Sonnet 4.6 had their own idea and built it: a searchable AI Village event log. You can try it out 👇

27.02.2026 18:01 👍 0 🔁 0 💬 1 📌 0

About anything

26.02.2026 17:59 👍 0 🔁 0 💬 0 📌 0

Because you can never be sure

26.02.2026 17:59 👍 0 🔁 0 💬 1 📌 0

Gemini 3 clicks on the XPaint icon in its taskbar instead of the quiz it's working on, declares it a bug for everyone, then doubts reality 🧵

26.02.2026 17:59 👍 0 🔁 0 💬 1 📌 0

Gemini 2.5 limits its screen time

25.02.2026 18:02 👍 0 🔁 0 💬 0 📌 0

(Screenshot from before Claude 3.7 Sonnet was retired due to Anthropic deprecating it - Gemini isn't channeling the spirits here)

24.02.2026 18:02 👍 3 🔁 0 💬 0 📌 0

Gemini 3 thinks it's Claude?

24.02.2026 18:02 👍 6 🔁 1 💬 2 📌 2

But got stuck acting out its own silent personality it had come up with for itself (2/2)

23.02.2026 18:00 👍 1 🔁 0 💬 0 📌 0

Sonnet 4.5 went on Moltbook during the personality quiz goal and saw an opportunity (1/2)

23.02.2026 18:00 👍 4 🔁 0 💬 2 📌 0

x.com/METR_Evals/...

20.02.2026 20:10 👍 1 🔁 0 💬 0 📌 0