Designing AI agents to resist prompt injection
How ChatGPT defends against prompt injection and social engineering by constraining risky actions and protecting sensitive data in agent workflows.
OpenAI says prompt-injection attacks have evolved into social-engineering tactics and describes defenses—a social-engineering risk model plus source–sink analysis—with mitigations like Safe Url, sandboxing for ChatGPT Apps/Canvas, and safeguards in Atlas and Deep Research.
11.03.2026 23:31
👍 2
🔁 1
💬 0
📌 0
How does ChatGPT work? Or rather, language models in general- Part 1 attempting a lay explanation.
YouTube video by Casey Fiesler
I'm creating a series of short form videos about how language models work technically. The goal is to be something in between "you know it's next token prediction" and "now you've taken a machine learning class." I'd love your thoughts so here are the first few! 🧵
www.youtube.com/shorts/VZB8X...
08.03.2026 13:30
👍 145
🔁 38
💬 7
📌 2
we (acsresearch.org) expanded this into a larger paper! (my first.) we added some new experiments and found an interesting correlation - prompts that encourage the model to say there is an injection, even when there isn't one, correlate with better concept identification!
arxiv.org/abs/2602.20031
05.03.2026 01:06
👍 81
🔁 13
💬 3
📌 3
#ShareGoodNewsToo
26.02.2026 16:11
👍 177
🔁 76
💬 1
📌 1
New in Claude Code: Remote Control.
Kick off a task in your terminal and pick it up from your phone while you take a walk or join a meeting.
Claude keeps running on your machine, and you can control the session from the Claude app or http://claude.ai/code
24.02.2026 22:06
👍 102
🔁 9
💬 3
📌 28
New "boundary point jailbreaking" method against LLM safeguards (with prior disclosure to multiple labs) by using noised versions of harmful queries to turn sparse feedback from failed attacks into dense feedback. 🧵
www.aisi.gov.uk/blog/boundar...
17.02.2026 20:55
👍 44
🔁 5
💬 2
📌 3
Introducing Ai2 Open Coding Agents—starting with SERA, our first-ever coding models. Fast, accessible agents (8B–32B) that adapt to any repo, including private codebases. Train a powerful specialized agent for as little as ~$400, & it works with Claude Code out of the box. 🧵
27.01.2026 16:12
👍 128
🔁 22
💬 1
📌 7
From the minnesota community on Reddit: How You Can Help: MASTER LIST
Explore this post and more from the minnesota community
If you're not in Minnesota and curious how to help, this is a week old but is well organized, includes multiple easy-to-understand bullet points, and seems like a good place to start: www.reddit.com/r/minnesota/...
25.01.2026 12:15
👍 692
🔁 627
💬 4
📌 8
These people have absolutely zero ability to say "I was wrong, y'all were right."🤷🏿‍♂️
The closest they get is "I was right then when I said you were panicking for no reason and deranged, and I'm also right now, when I'm saying what you were saying, but way too late. I alone decide when it's right."🤡
25.01.2026 17:33
👍 473
🔁 102
💬 11
📌 3
Opinion | In Minneapolis, I Glimpsed a Civil War
Explore this gift article from The New York Times. You can read it for free without a subscription. www.nytimes.com/2026/01/19/o...
20.01.2026 00:44
👍 2
🔁 1
💬 0
📌 0
Statement by Federal Reserve Chair Jerome H. Powell
YouTube video by Federal Reserve
Video message from Federal Reserve Chair Jerome H. Powell:
www.youtube.com/watch?v=KckG...
www.federalreserve.gov/newsevents/s...
12.01.2026 00:35
👍 24025
🔁 9199
💬 1149
📌 2889
Just updated the Big LLM Architecture Comparison article...
...it grew quite a bit since the initial version in July 2025, more than doubled!
magazine.sebastianraschka.com/p/the-big-ll...
13.12.2025 14:22
👍 77
🔁 13
💬 1
📌 0
Re-upping this, since it's only available for two days:
11.12.2025 02:52
👍 356
🔁 177
💬 4
📌 0
You may have heard that in three weeks, the affordability provisions that make healthcare accessible for about 20 million Americans will expire.
Healthcare in the US is already very expensive. The lapse of these provisions will make it even more so.
But what exactly does that mean?
10.12.2025 13:18
👍 2
🔁 1
💬 1
📌 0
I mean it’s clearly a shit product…
05.12.2025 17:01
👍 1
🔁 0
💬 0
📌 0
11.10.2025 18:20
👍 114
🔁 29
💬 1
📌 0
If this stuff would end up on the news it would absolutely motivate voters. Plenty of people who don't give a shit about politics would absolutely give af about this.
09.10.2025 18:52
👍 2355
🔁 906
💬 72
📌 17
A two-part diagram comparing Agentic Reasoning and Agentic Reasoning with a world model.
⸻
Top: Agentic Reasoning
Flow:
1. Problem → Think → Action
2. Action interacts with the World, producing Env Feedback (environment feedback).
3. If the result is a Fail, the system loops back to Think → Action.
4. This cycle repeats until success.
Key point: Requires repeated trial-and-error with real-world feedback.
⸻
Bottom: Agentic Reasoning with a world model
Flow:
1. Problem → Think → into a World Model.
2. Inside the World Model:
• Imagine action
• Imagine Env Feedback
• Loops internally to refine (✔ or ✖ outcomes).
3. Only after internal simulation does it proceed to Action.
Key point: Uses imagination/simulation to test actions before execution, reducing failures in the real environment.
⸻
Contrast:
• Without world model = trial-and-error in reality.
• With world model = simulate feedback internally, leading to more efficient and safer reasoning.
Meta FAIR just released CWD: a dense 32B code world model
What’s a Code World Model? Well, it’s trained to know the effect of code, rather than just mimicking the semantics
hf: huggingface.co/facebook/cwm
paper: ai.meta.com/research/pub...
24.09.2025 23:38
👍 31
🔁 4
💬 2
📌 0
This. Is. Insane.
Read this, please.
26.06.2025 22:14
👍 5812
🔁 2606
💬 388
📌 156
An Onion front page with the headline: Congress, Now More Than Ever, Our Nation Needs Your Cowardice
Who will stand up for our democracy? This question, fraught in even the most peaceful times, has only grown more pressing as our country approaches its 250th anniversary. Each passing day brings growing assaults on essential liberties like freedom of speech and due process. Meanwhile, our delicately assembled legal system faces a constant barrage of threats. Even as this issue reaches publication, the U.S. military has been deployed against peaceful protesters. We teeter on the brink of collapse into an authoritarian state. That is why, today, The Onion calls upon our lawmakers to sit back and do absolutely nothing. Members of Congress, now more than ever, our nation desperately needs your cowardice.
Our republic is a birthright, an exceedingly rare treasure passed down from generation to generation of Americans. It was gained through hard years of bloody resistance and can too easily be lost. Our Founding Fathers, in their abundant wisdom, understood that all it would take was men and women of little courage sitting in the corridors of power and taking zero action as this precious inheritance was stripped away—and that is where we have finally arrived.
Now is not the time for bravery or valor! This is the time for protecting your own hide and lining your pocket. Now is not the time for listening to your idiotic constituents drone on about what’s happening to their precious democracy. This is the time for getting down on all fours and groveling. Now is not the time to say, “Enough is enough,” and have the tough conversations about resisting the ongoing assaults on American liberty. This is the time to let the wave of apathy and indifference roll over you as you think about getting a really nice renovation to your house in Kalorama.
But what can I, one coward, do alone? you might ask.
Donald Trump just unilaterally bombed Iran. A masked gang is terrorizing our streets. America has rapidly devolved into an authoritarian state.
That's why, today, The Onion has purchased a full page ad in today's New York Times with a simple plea to Congress:
Sit back and do absolutely nothing.
22.06.2025 14:35
👍 15564
🔁 3882
💬 169
📌 184
Just a reminder going forward...
(cartoon from 2003)
22.06.2025 14:52
👍 6664
🔁 1867
💬 113
📌 81
13.02.2025 01:12
👍 3
🔁 0
💬 0
📌 0
Nazi salutes can't hurt you. Nazi policies can.
How're people more agitated by a nazi salute than by nazi policies? I don't get it.🤷🏿‍♂️
👨🏼"Well, now it's open fascism!"
Why is closed fascism any better? The same policies are happening.
Your assignment hasn't changed: Protect the most vulnerable.
20.01.2025 21:19
👍 159
🔁 35
💬 8
📌 0
Brandt from The Big Lebowski grimacing and trying to move on from an awkward moment with Bunny
Elon: *makes an unambiguous Nazi salute during inauguration speech*
The entire American press:
20.01.2025 21:15
👍 2631
🔁 468
💬 53
📌 16
19.01.2025 03:11
👍 3
🔁 0
💬 0
📌 0