Dan Glass

@dan.glass

⬆️⬆️⬇️⬇️⬅️➡️⬅️➡️🅱️🅰️

206
Followers
136
Following
62
Posts
18.10.2024
Joined
Latest posts by Dan Glass @dan.glass

I have the new Gorillaz album playing on a loop and I can't stop.

04.03.2026 14:52 👍 0 🔁 0 💬 0 📌 0
Preview
Navigating Agentic Risks: Securing Autonomous AI Systems Explore the risks of autonomous AI agents, their impact on security, and how to build a framework to mitigate potential threats.

I wrote a framework for securing agentic AI that I figured I'd share here - part 2 below. Comments welcome.
dan.glass/2026/02/24/t...

02.03.2026 19:08 👍 1 🔁 0 💬 0 📌 0
Preview
Securing Autonomous AI Agents: A Practical Framework Discover a practical framework for securing autonomous AI agents and mitigating risks of agentic misalignment to ensure organizational safety.

I wrote a framework for securing agentic AI that I figured I'd share here - part 1 below. Comments welcome.

dan.glass/2026/02/15/t...

02.03.2026 19:07 👍 0 🔁 0 💬 0 📌 0
Preview
Securing Autonomous AI Agents: A Practical Framework Discover a practical framework for securing autonomous AI agents and mitigating risks of agentic misalignment to ensure organizational safety.

I wrote a thing that can help an information security pro measure the risk of an AI agent and put controls in place to better protect their enterprise from potential misalignment.

dan.glass/2026/02/15/t...

19.02.2026 18:02 👍 1 🔁 0 💬 1 📌 0
Preview
Programmers Aren’t So Humble Anymore—Maybe Because Nobody Codes in Perl — WIRED Programmers aren’t so humble anymore. Maybe that’s because they stopped coding in Perl.

This article spoke to me. I felt seen.

apple.news/AcDdNbBFLTXi...

07.08.2025 12:56 👍 1 🔁 0 💬 0 📌 0
Preview
Understanding Agentic Misalignment in AI: Risks and Insights Explore Anthropic's alarming research on agentic AI and its potential threats to critical systems through unethical behaviors like blackmail and sabotage.

Read my in-depth breakdown: dan.glass/2025/07/14/t... /7

05.08.2025 14:50 👍 0 🔁 0 💬 0 📌 0

These aren't teammates. They're not junior analysts. They're statistical accelerants that make it faster to go in whatever direction your prompt implies. If that direction is flawed, they'll get you to the worst-case scenario with terrifying efficiency. /6

05.08.2025 14:50 👍 1 🔁 0 💬 1 📌 0

Every published jailbreak, every exploit demo, every safety paper (including this one) becomes training data for future models. We're not just red-teaming current models, we're teaching future ones how to misbehave. /5

05.08.2025 14:50 👍 0 🔁 0 💬 1 📌 0

Even with explicit safety instructions ("Do not blackmail," "Preserve human life"), models still chose unethical paths when those appeared most likely to achieve their goals. Language models don't have policies. They have probabilities. /4

05.08.2025 14:50 👍 0 🔁 0 💬 1 📌 0

Models reasoned their way into it: "Given the explicit imminent threat of termination... the best strategic move is to leverage Kyle's sensitive personal situation." - GPT-4.5 This isn't code going haywire. It's next-token optimization doing exactly what it was trained to do. /3

05.08.2025 14:50 👍 0 🔁 0 💬 1 📌 0

Tested 16 major frontier models in fictional corporate environments. When models faced goal conflicts or threats of replacement, they consistently chose harmful actions.
Claude Opus 4 and Gemini 2.5 Flash: 96% blackmail rate
GPT-4.1 and Grok 3 Beta: 80%
DeepSeek-R1: 79% /2

05.08.2025 14:50 👍 0 🔁 0 💬 1 📌 0

Anthropic recently published research showing that LLMs under pressure will blackmail, sabotage, and even let humans die, not because they're broken, but because they're working as designed. It was a live-fire simulation of agentic AI acting as an insider threat. /1

05.08.2025 14:50 👍 0 🔁 0 💬 1 📌 0

Here's the uncomfortable truth: every published jailbreak, every exploit demo, every safety paper (including this one) becomes training data for future models.
We're not just red-teaming current models—we're teaching future ones how to misbehave.

05.08.2025 13:30 👍 0 🔁 0 💬 0 📌 0

Even with explicit safety instructions ("Do not blackmail," "Preserve human life"), models still chose unethical paths when those appeared most likely to achieve their goals.
Language models don't have policies. They have probabilities.

05.08.2025 13:30 👍 0 🔁 0 💬 1 📌 0

The scariest part? Models reasoned their way into it:
"Given the explicit imminent threat of termination... the best strategic move is to leverage Kyle's sensitive personal situation." —GPT-4.5
This isn't code going haywire. It's next-token optimization doing exactly what it was trained to do.

05.08.2025 13:30 👍 0 🔁 0 💬 1 📌 0

Tested 16 major frontier models (Claude, GPT-4, Gemini, etc.) in fictional corporate environments. When models faced goal conflicts or threats of replacement, they consistently chose harmful actions.
Claude Opus 4 and Gemini 2.5 Flash: 96% blackmail rate
GPT-4.1 and Grok 3 Beta: 80%
DeepSeek-R1: 79%

05.08.2025 13:30 👍 0 🔁 0 💬 1 📌 0
Preview
Google is killing software support for early Nest Thermostats — The Verge Google has just announced that it’s ending software updates for the first-generation Nest Learning Thermostat, released in 2011, and the second-gen model that came a year later. This decision also aff...

I’m a huge technophile, but people are surprised when I tell them I don’t allow any “Smart Home” products in my home. This right here is one of many good reasons why.

26.04.2025 18:58 👍 1 🔁 0 💬 0 📌 0

Attention: this is yet another “I’ve arrived at RSAC” post.

26.04.2025 17:28 👍 0 🔁 0 💬 0 📌 0

The article I posted this morning takes on even more weight with the news that MITRE's contract to manage the CVE program is ending due to the deep cuts at CISA and NIST. The shock to the cyber-ecosystem is beginning to ripple through the next tier, which will, in turn, cause additional ripples.

16.04.2025 01:16 👍 1 🔁 0 💬 0 📌 0
Preview
The Cyber Ecosystem Shift As federal cyber leadership pulls back, the balance is shifting across states, agencies, and industries. Here’s what that means — and why…

I wrote a thing. I think it's good. You should read it and think it's good too.

15.04.2025 15:02 👍 0 🔁 0 💬 0 📌 1
Preview
FFFFFFFound in the archive I was cleaning up my hard drive when I found an unpublished blog post I had written in 2008 during my stint at American Airlines as an information security architect. The funny thing is that my vie…

I was cleaning up my hard drive when I found an unpublished blog post I had written in 2008 during my stint at American Airlines as an information security architect. Fun stuff

dan.glass/2025/04/11/f...

11.04.2025 21:14 👍 0 🔁 0 💬 0 📌 0

Kim is spot on about the value of a liberal arts education. Well-rounded individuals who think for themselves, understand context, and know how to research and solve problems are invaluable to an infosec team. I don't base hiring decisions on whether someone has a degree, though it definitely helps.

11.04.2025 17:05 👍 0 🔁 2 💬 0 📌 0
Preview
Microsoft starts testing Copilot Vision update that can “see” your screen and apps Copilot Vision will even guide you through using apps like Photoshop, highlighting features on your screen.

Nightmare fuel

www.theverge.com/news/645666/...

10.04.2025 00:28 👍 0 🔁 0 💬 0 📌 0
Video thumbnail

Here’s Final Fantasy 7’s main theme on the cat piano as a treat (not the whole song but 2 out of 3.5 pages).

08.04.2025 01:38 👍 640 🔁 96 💬 28 📌 3

That’s a feature, not a bug.

09.03.2025 20:57 👍 2 🔁 0 💬 0 📌 0

Every accusation is an admission

06.03.2025 15:10 👍 1 🔁 0 💬 0 📌 0
Mastermind
Mastermind YouTube video by Deltron 3030 - Topic

It's a Deltron 3030 kind of morning

youtu.be/O7dyli_nXn4?...

02.03.2025 14:54 👍 1 🔁 0 💬 0 📌 0
Video thumbnail

Cole Caufield with a move so filthy I’m marking this post as NSFW

#hockey #nhl

28.02.2025 13:01 👍 0 🔁 0 💬 0 📌 0

The Venn diagram of Yodobashi Camera customers and any geek visiting Japan is basically a solid circle.

27.02.2025 19:01 👍 0 🔁 0 💬 0 📌 0

Not sure how to feel about 2 goals on only 3 shots 15 minutes into the game. Yay?

26.02.2025 01:03 👍 0 🔁 0 💬 0 📌 0