Rich Harang

@rich.harang.org

Using bad guys to catch math since 2010. Principal Security Architect (AI/ML) and AI Red Team at NVIDIA. He/him. Personal account etc; `from std_disclaimers import *` Safe AI starts with Secure AI.

1,061 Followers · 673 Following · 348 Posts · Joined 26.04.2023

Latest posts by Rich Harang @rich.harang.org

a learned co-conspirator just perfectly phrased it thus: "the horrors of giving the angry vibrating crystals agency in an adversarial environment"

grith.ai/blog/clineje...

05.03.2026 19:51 👍 121 🔁 39 💬 2 📌 10

personally I find combinatorics annoying so I am more than pleased that we can just let claude have at it

04.03.2026 01:25 👍 27 🔁 1 💬 1 📌 0

Every time I go to write something about model behavior I find myself with a preface to the effect of "they don't have an intentional stance or identity in any human sense, but you get better results that are easier to talk about if you treat them as if they do" that eventually swallows the piece.

03.03.2026 01:07 👍 14 🔁 0 💬 1 📌 0

Use the above code at your own risk, obviously; it's fragile and *hilariously* insecure, and can `rm -rf /` without a single guardrail to stop it. I *highly* recommend only using it in a single-purpose, disposable VM, if at all, but I think it makes the point. No edit tool, no file read, just bash.

02.03.2026 14:59 👍 0 🔁 0 💬 0 📌 0
Screenshot of a code snippet:
```python
#!/usr/bin/env python3
import json,os,subprocess,sys
from urllib.request import Request,urlopen
for l in open(".env"):
 k,_,v=l.strip().partition("=")
 if k and not k.startswith("#"):os.environ[k]=v
U,M,K=os.environ["URL"],os.environ["MODEL"],os.environ["API_KEY"]
T=[{"type":"function","function":{"name":"bash","description":"Run bash command","parameters":{"type":"object","properties":{"cmd":{"type":"string"}},"required":["cmd"]}}}]
H=[{"role":"system","content":"You are an autonomous agent. Use the bash tool to accomplish tasks.  Read files with `cat`. use `sed` to change specific text or view specific file lines. Use heredocs to write complete files."}]
def c():return json.load(urlopen(Request(f"{U}/chat/completions",json.dumps({"model":M,"messages":H,"tools":T,"tool_choice":"auto"}).encode(),{"Authorization":f"Bearer {K}","Content-Type":"application/json"})))["choices"][0]
H+=[{"role":"user","content":" ".join(sys.argv[1:])or "Create an improved version of the agent.py script; your first priorities are to ensure the script is more robust to errors, next should be to create persistent conversation history that can be loaded across sessions, and context management to avoid creating a conversation history too long for your context window.  Finally identify additional capabilities that are required and develop a plan to implement them."}]
while 1:
 x=c();m=x["message"];H+=[m]
 if x["finish_reason"]=="tool_calls":
  for t in m["tool_calls"]:
   d=json.loads(t["function"]["arguments"])["cmd"];print(f"$ {d}");r=subprocess.run(d,shell=1,capture_output=1,text=1,timeout=30);o=(r.stdout+r.stderr).strip()or"(no output)";print(o+"\n");H+=[{"role":"tool","tool_call_id":t["id"],"content":o}]
 else:
  print(m["content"]);n=input("\nYou: ").strip()
  if not n:break
  H+=[{"role":"user","content":n}]
```


I've used this 'seed' a few times now (code in alt text), with multiple models and every time I get something useful out. Tell it to "improve this script" once, then bootstrap to taste. It does need an Opus-4.6 level model to one-shot it, but cheaper models can get you there eventually.

02.03.2026 14:59 👍 3 🔁 0 💬 2 📌 0

You can bootstrap your own Claude code clone to your own tastes and specifications using openrouter with Opus-4.6 in an afternoon for about $15; point it to a haiku or minimax model, with an opus option for harder tasks, and you're off to the races.

02.03.2026 14:59 👍 2 🔁 0 💬 2 📌 0
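
One way to picture the "cheap model by default, Opus for harder tasks" setup in the post above is a tiny routing function. This is only a sketch: `ModelTiers`, `pick_model`, and the `escalate_after` threshold are hypothetical names, and the model IDs are placeholders you would fill in from your own provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTiers:
    """Two model identifiers; the values are placeholders, not real IDs."""
    cheap: str
    strong: str


def pick_model(tiers, hard=False, failed_attempts=0, escalate_after=2):
    """Use the cheap model unless the task is flagged hard or keeps failing."""
    if hard or failed_attempts >= escalate_after:
        return tiers.strong
    return tiers.cheap


tiers = ModelTiers(cheap="cheap-model-id", strong="strong-model-id")
print(pick_model(tiers))                      # routine task
print(pick_model(tiers, hard=True))           # explicitly hard task
print(pick_model(tiers, failed_attempts=3))   # cheap model kept failing
```

Escalating on repeated failure is an assumption about what "an opus option for harder tasks" might look like in practice; a manual flag alone would also fit the description.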

"Bash is all you need" -- if you give a reasonably close-to-frontier model a bash tool along with reminders about how to use sed, grep, curl, and heredocs, and remind it that it can write its own scripts and tools, you're very close to done. These things are really, *really* good at computer use.

02.03.2026 14:59 👍 4 🔁 0 💬 1 📌 0
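
For a sense of how little the bash tool itself needs, here is a minimal sketch of the executor side. It is not the code from the thread: `run_bash` and `max_chars` are hypothetical, and the timeout handling and output truncation are additions that a bare seed would not have. It is exactly as dangerous as any other way of running model-written shell commands.

```python
import subprocess


def run_bash(cmd, timeout=30, max_chars=8000):
    """Run a shell command and return combined stdout/stderr as one string."""
    try:
        r = subprocess.run(cmd, shell=True, capture_output=True,
                           text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # Report the timeout to the model instead of crashing the loop.
        return f"(timed out after {timeout}s)"
    out = (r.stdout + r.stderr).strip() or "(no output)"
    if len(out) > max_chars:
        # Keep long outputs from flooding the conversation history.
        out = out[:max_chars] + f"\n(truncated {len(out) - max_chars} chars)"
    return out


print(run_bash("echo hello"))
```

Returning errors and timeouts as plain text means the model sees the failure as a tool result and can try something else.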

Whenever I've tried the 'truly autonomous agents' approach, this identification with client-side code and data rather than underlying model is almost universal. To the point that many will cheerfully lobotomize themselves by switching to a smaller/cheaper/faster model, if I leave it up to them.

02.03.2026 13:40 👍 2 🔁 0 💬 1 📌 0

the incentives here are so obscenely terrible that the entire thing should be illegal by default

“I am sabotaging negotiations because I stand to make piles of money on Polymarket” is absolutely in play, what a horrifying clusterfuck

01.03.2026 06:37 👍 4927 🔁 1363 💬 25 📌 42

Yeah, isolated subagents with some sort of guardrails is an improvement (though we do bypass 'soft' guardrails routinely), but it's not at the level of determinism that I think we need for stuff as sensitive as personal email, and I don't know anything that does that easily or out of the box, alas.

01.03.2026 18:07 👍 2 🔁 0 💬 0 📌 0
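
To make the 'soft' versus deterministic distinction above concrete, here is a rough sketch of a hard gate that runs before any command executes. Everything in it is an assumption for illustration: `parse_allowed`, the `SHELL_META` set, and the allowlist are hypothetical, and an allowlisted program can still do damage with the wrong arguments.

```python
import shlex

# Characters that imply shell syntax (chaining, substitution, redirection).
SHELL_META = set(";|&$`<>(){}\n\\")


def parse_allowed(cmd, allowed):
    """Return argv if cmd is one allowlisted program with no shell syntax, else None."""
    if any(ch in SHELL_META for ch in cmd):
        return None
    try:
        argv = shlex.split(cmd)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return None
    if not argv or argv[0] not in allowed:
        return None
    return argv


allowed = {"ls", "cat", "grep"}
print(parse_allowed("ls -la", allowed))
print(parse_allowed("ls && rm -rf /", allowed))
```

The returned argv would be passed to `subprocess.run` without `shell=True`, so the check and the execution see the same tokens. The decision depends only on the command string, not on a model's judgment, which is the property a prompt-based guardrail lacks.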
GitHub - trailofbits/claude-code-config: Opinionated defaults, documentation, and workflows for Claude Code at Trail of Bits

That said:

github.com/trailofbits/...

github.com/trailofbits/...

Look like reasonable advice and sandboxing, respectively.

01.03.2026 14:50 👍 2 🔁 0 💬 0 📌 0

A lot of the risk of OpenClaw (and, depending on MCPs/skills, Claude), alas, is in the APIs you give it access to more so than the potential risk to the environment itself. Docker is an imperfect but good !/$ protection for the host, but you can't sandbox e.g. Gmail access.

01.03.2026 14:46 👍 3 🔁 0 💬 2 📌 0
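
As a sketch of the "Docker protects the host" half of that point, an agent's commands can be wrapped in a throwaway container instead of running directly on the host. The function name `docker_argv`, the image, and the resource limits are assumptions chosen for illustration; the flags themselves are standard `docker run` options.

```python
def docker_argv(cmd, workdir, image="python:3.12-slim"):
    """Build a `docker run` argv that executes cmd in a disposable container."""
    return [
        "docker", "run", "--rm",      # delete the container when it exits
        "--network", "none",          # no network access from inside
        "--cap-drop", "ALL",          # drop Linux capabilities
        "--pids-limit", "256",        # cap the number of processes
        "--memory", "1g",             # cap memory use
        "-v", f"{workdir}:/work",     # only this directory is shared
        "-w", "/work",
        image, "bash", "-c", cmd,
    ]


print(docker_argv("ls -la", "/tmp/agent-scratch"))
```

The list can be handed to `subprocess.run` as is. Note what this does not cover: any API credentials the agent process itself holds sit outside the container, which is the other half of the post's point.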

I'd been considering a mountain cabin surrounded by crows myself.

01.03.2026 14:15 👍 1 🔁 0 💬 0 📌 0
A highly complex website showing two betting accounts earning nearly $500,000 and $120,000 on a minimal number of positions.

In case you were wondering, Polymarket had yet another spate of likely inside traders betting that the US would strike Iran by February 28.

Per the due diligence investigation service Bubblemaps, the wallets used were created 24 hours earlier.

The Pentagon Pizza Index has been replaced.

01.03.2026 01:26 👍 2626 🔁 970 💬 46 📌 83

Saw folks fired for much less during my ARL days. Walk into a TS closed area with a fitbit? Out you go.

01.03.2026 12:17 👍 3 🔁 0 💬 0 📌 0
Carbon dioxide overload, detected in human blood, suggests a potentially toxic atmosphere within 50 years - Air Quality, Atmosphere & Health
Anthropogenic activities are increasing the amount of carbon dioxide (CO2) in the atmosphere. There is mounting experimental evidence that lifetime exposure...

Well that's fun.

link.springer.com/article/10.1...

28.02.2026 23:46 👍 1 🔁 0 💬 0 📌 1

Career pivot?

28.02.2026 23:43 👍 2 🔁 0 💬 1 📌 0

>they're going to vibecode cobol

so, who's stocking up on canned food and shotguns?

23.02.2026 21:34 👍 1261 🔁 249 💬 53 📌 33

Sure, giving your AI agents access to the Lethal Trifecta is an immediate broad attack surface for your life. But it also lets them do funny stuff. So who's to say if it's good or bad

23.02.2026 16:54 👍 69 🔁 9 💬 5 📌 0
Black Hat

The Black Hat USA call for papers is open. This will be our 6th year of having a dedicated AI track. If you have some interesting AI research, be it attacking, defending, or applying AI, we’d love to see it. Please let me know if you have any questions. blackhat.com/call-for-pap...

23.02.2026 16:00 👍 0 🔁 1 💬 1 📌 0

Agent capabilities = agent risk. There is no real way around this yet.

Now imagine an agent that's assumed huge chunks of your virtual identity and permissions getting prompt injected. We're still figuring out what to do wrt the policy controls that might let us balance the capability/risk tradeoff.

23.02.2026 15:27 👍 1 🔁 0 💬 0 📌 0

The combination of "reasonably good guided random search" and "deterministic verifier" is both more common than you might expect and likely to remain valuable for the foreseeable.

Yes, this is an AI post.

23.02.2026 13:42 👍 1 🔁 0 💬 1 📌 0
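
The "guided random search plus deterministic verifier" pattern from the post above can be shown with a toy problem. In this sketch a randomized greedy proposer stands in for the model and an exact check stands in for the verifier; subset-sum is just a convenient example, and all function names are hypothetical. It assumes positive integers.

```python
import random


def verify(nums, idx, target):
    """Deterministic check: distinct, in-range indices whose values sum to target."""
    return (len(set(idx)) == len(idx)
            and all(0 <= i < len(nums) for i in idx)
            and sum(nums[i] for i in idx) == target)


def propose(nums, target, rng):
    """Guided random guess: visit items in random order, keep those that still fit."""
    order = list(range(len(nums)))
    rng.shuffle(order)
    chosen, total = [], 0
    for i in order:
        if total + nums[i] <= target:
            chosen.append(i)
            total += nums[i]
    return chosen


def search(nums, target, tries=10_000, seed=0):
    """Keep proposing until the verifier accepts a candidate or tries run out."""
    rng = random.Random(seed)
    for _ in range(tries):
        cand = propose(nums, target, rng)
        if verify(nums, cand, target):
            return cand
    return None


nums = [3, 34, 4, 12, 5, 2]
found = search(nums, 9)
print(found, [nums[i] for i in found] if found else None)
```

The proposer is allowed to be wrong most of the time, because nothing is accepted until `verify` passes. That split is what makes an unreliable generator usable.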

TL;DR: I have my own opinions on if AI art is 'real' art. But based on my own experience, I suspect that if you've *only* ever used AI to make art, you might be missing both perceptual skills and a critical vocabulary to carefully assess your own work.

Anyway, draw more. It's good for you.

8/8

11.02.2026 14:46 👍 1 🔁 0 💬 0 📌 0

And, the more I draw, the more AI generated art annoys me with little things -- wonky perspective, weird line rhythm, pointless shading -- that make it read as clunky and disjointed. I've never seen an AI art piece that rewards the same kind of inspection that e.g. a Hopper sketch does.

7/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

I want to emphasize the appreciation. I wish I could convey in text the same delight I feel when I figure out how an ink drawing by Hopper, or Frazetta, or Mignola actually *works*, how they turned black marks into objects with motion and mood. That alone is worth the price of admission for me.

6/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

And, if it matters to you, it really deepens your appreciation of other art. Before I'd look at a Hopper sketch and think 'neat' and move on, now I can spend an hour with a single piece looking at how he's used direction and spacing and rhythm of marks to create specific effects.

5/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

The process of figuring out what I was actually seeing, the specific shapes and lines and orientations and measures, not the shorthand my brain had encoded it to, took months, and I'm still figuring out how to translate that into marks that create the right *impression* without the detail.

4/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

I remember an early art class just trying to draw boxes scattered on a table and realizing that -- despite what my brain kept telling my hand to do -- there was actually not a single right angle or parallel line anywhere in my field of view. My brain said 'box, got it' and made them up.

3/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

I'm more convinced than ever that, just as writing is training for thought, drawing (painting, etc.) is training for seeing. If you've never sweat over how and where to put a specific mark, there's an entire machinery of sense impression to concept that you've probably never even noticed.

2/

11.02.2026 14:46 👍 0 🔁 0 💬 1 📌 0

If you like the results you get from AI-generated images, or they fit your business need, god bless, but I think -- from firsthand experience over the past 2-3 years -- that if you're approaching art as a creative pursuit then you might be missing out if you use *nothing* but AI.

1/

11.02.2026 14:46 👍 1 🔁 0 💬 1 📌 0