
Anson

@ansonbiggs.com

Writing software to make a safer, autonomous future. Formerly Blue Origin and ULA. Send me your blog ansonbiggs.com

154 Followers · 476 Following · 67 Posts · Joined 22.12.2023

Latest posts by Anson @ansonbiggs.com

Actually seeing a ton of value in deep tech. Domain experts are able to focus on their domain, and AI handles the secondary tasks that they aren’t good at like ci/cd, cmake, etc.

09.03.2026 19:07 👍 0 🔁 0 💬 1 📌 0
announcing our €3,8M seed round and more on what's next

today, we're announcing our €3,8M ($4.5M) seed financing round, led by byFounders with participation from Bain Capital Crypto, Antler, Thomas Dohmke (former CEO of GitHub), Avery Pennarun (CEO of Tailscale) among other incredible angels.

read more on what's next: blog.tangled.org/seed

02.03.2026 09:51 👍 809 🔁 146 💬 54 📌 68

time traveler from 12 months from now just sent me this

27.02.2026 21:25 👍 1618 🔁 205 💬 66 📌 34
Hoard things you know how to do - Agentic Engineering Patterns - Simon Willison's Weblog

Today's chapter of Agentic Engineering Patterns is some good general career advice which happens to also help when working with coding agents: Hoard things you know how to do simonwillison.net/guides/agent...

26.02.2026 21:14 👍 86 🔁 5 💬 5 📌 1

watching firefox slowly die is so sad

26.02.2026 06:23 👍 0 🔁 0 💬 0 📌 0

the classic dev blog issue. It’s more fun to make and tinker on the blog than it is to write it

24.02.2026 20:00 👍 1 🔁 0 💬 0 📌 0
graph of programming languages by percent of programmers identifying as LGBTQ.  Rust is way in the lead at a whopping 55% with the next highest being zig at 30%, followed by a smooth curve with haskell being at 28% and typescript being at 25%.  more languages are listed but it is mostly uninteresting.


i love this graph

24.02.2026 01:53 👍 193 🔁 41 💬 12 📌 29

torrents will download at impressive speeds until of course you get to the last megabyte at which point they slow to 1 byte per second as they scour the entire world for the rare file fragments lost to the sands of time

23.02.2026 09:14 👍 66 🔁 9 💬 4 📌 1

I regularly see people wondering how it's possible that there are so many musicians and writers and film makers and artists from a tiny nation like Iceland.

And the answer is really simple: State funding for art education and artists. I literally get a salary from the government to write books.

18.02.2026 14:23 👍 20817 🔁 5583 💬 217 📌 375
You're probably using Agent Skills wrong

The entire ecosystem around Claude Code is pretty confusing, the naming conventions are a mess, and the pace of change is beyond any production tool I've seen. However, Skills are probably the most misused. I see it at work a ton, but a paper just came up on Hacker News:

> SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
>
> Agent Skills are structured packages of procedural knowledge that augment LLM agents at inference time. Despite rapid adoption, there is no standard way to measure whether they actually help. We present SkillsBench, a benchmark of 86 tasks across 11 domains paired with curated Skills and deterministic verifiers. Each task is evaluated under three conditions: no Skills, curated Skills, and self-generated Skills. We test 7 agent-model configurations over 7,308 trajectories. Curated Skills raise average pass rate by 16.2 percentage points (pp), but effects vary widely by domain (+4.5pp for Software Engineering to +51.9pp for Healthcare) and 16 of 84 tasks show negative deltas. Self-generated Skills provide no benefit on average, showing that models cannot reliably author the procedural knowledge they benefit from consuming. Focused Skills with 2–3 modules outperform comprehensive documentation, and smaller models with Skills can match larger models without them.
>
> arXiv.org, Xiangyi Li

The HN title is editorialized for some reason, _"Study: Self-generated Agent Skills are useless"_, but it immediately grabbed me since I get massive value from Skills written by Agents, yet I also consistently see them misused by my peers. The concept is great, and I've been looking at benchmarking specific parts of the Agentic ecosystem myself, so this was highly relevant to me. Overall the paper is decent, but one bullet invalidates the whole thing:

> **Self-Generated Skills**: No Skills provided, but the agent is prompted to generate relevant procedural knowledge before solving the task. This isolates the impact of LLMs’ latent domain knowledge.

So all they are doing is taking a problem that a model can't solve well on its own, and asking it to write about the task before attempting it. They just reinvented thinking blocks but worse!

### The Skill Anti-Pattern

What they did is a very common mistake that I see constantly: my Agent is bad at this thing, so I ask the Agent to write a Skill on this thing. I'll reiterate that this is identical to thinking blocks. In order for your Agent to create something worthwhile, you have to make sure it can see the gaps. I see this as the classic CS intro where you ask someone to write out the steps to make a PB&J: you don't really understand what makes the problem hard until you've struggled through solving it.

This leads directly into the largest faux pas of the AI era: asking an LLM someone else's question verbatim and pasting the LLM's answer as your response. If I ask you how you did something cool with an Agent, and you just on the fly have a fresh Agent build me a SKILL.md on my question, _I will kill you._

## What are Skills

Before getting into proper usage, I just want to cover what Skills are. As a primitive they are just markdown files that have some metadata at the top to help Agents/Tools know when to use them, and then the rest of the document is the Skill. Each Skill has its own folder so it can not only teach your Agent how to do something but also give it better tools.

    .claude/skills/
    └── monitor-gitlab-ci/
        ├── SKILL.md          # The file mentioned above
        ├── monitor_ci.sh     # Complicated command
        └── references/       # Additional references
            ├── api_commands.md
            ├── log_analysis.md
            └── troubleshooting.md

Above is a Skill I used a ton to let older versions of Claude work on my GitLab CI. It's a folder with a simple markdown Skill that just explains the setup and that the Agent needs to watch the CI until either a job fails or everything passes, a simple CLI to prevent the Agent from writing a script, and additional references for edge cases.

## Skills for Context

Agents are completely stateless, meaning that every new conversation is like meeting the model for the first time: it has no idea what your project is or what you were working on 10 minutes ago. CLAUDE.md does a lot to fix this, but for a large enough project it can't contain everything. If I open up a monorepo and tell Claude to run a SIL test, then it is going to have to run around to figure out how to do that. It has to figure out what language the project is in, then look for common test patterns for that language; it's going to see a complicated Docker Compose setup, it's going to see that the containers need x86 but we're running on a Mac, then it's going to look for CI, etc. This can all be solved by writing Skills for common, but not universal, patterns. A good habit is having Claude explain bespoke parts of your project as a Skill while you implement.

## Skills for Hard Problems

Claude can solve some really hard problems, but it might take $500 in tokens and you might have to yell at it for reward hacking a few times. Almost any time I have to intervene on a problem, once the Agent is unstuck I ask it what the gap was that kept it from figuring it out on its own. Sometimes it's something silly, but sometimes it is something genuinely insightful and I have Claude make a Skill to fill the gap.

## Conclusion

I edited the original benchmark to do Skills my way and the results were incredible: the Agents nailed the test with proper Skills. I don't have the money to spend on fully validating this result, but the first pass was good enough for me to be happy. I assume this changes the benchmark enough that it was too hard to do properly, so the Authors opted to do it wrong.

Make sure when you create a Skill the Agent knows something that the base model doesn't. This can even happen during the creation of the Skill, but the Agent must have the opportunity to understand the issue it is solving before it can solve it better than a base model. Happy Hacking.
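
The post's description of a Skill as a markdown file with metadata at the top, living in its own folder, can be sketched in Python. This is only a minimal sketch under stated assumptions: it assumes the metadata is a block of `key: value` lines between two `---` lines, and the keys `name` and `description` used in the demo are illustrative. Neither the delimiter nor the key names are given in the post, and `read_skill_metadata` and `discover_skills` are hypothetical helpers, not part of any Claude tooling.

```python
import tempfile
from pathlib import Path


def read_skill_metadata(skill_md_text: str) -> dict[str, str]:
    """Parse the `key: value` metadata block at the top of a SKILL.md.

    Assumption: the block is delimited by lines containing only `---`.
    Returns an empty dict when no complete block is found.
    """
    lines = skill_md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no metadata


def discover_skills(skills_root: Path) -> dict[str, dict[str, str]]:
    """Map each skill folder name to the metadata found in its SKILL.md."""
    found: dict[str, dict[str, str]] = {}
    for skill_file in sorted(skills_root.glob("*/SKILL.md")):
        text = skill_file.read_text(encoding="utf-8")
        found[skill_file.parent.name] = read_skill_metadata(text)
    return found


if __name__ == "__main__":
    # Build a throwaway skills folder shaped like the one in the post.
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / "monitor-gitlab-ci"
        skill_dir.mkdir()
        (skill_dir / "SKILL.md").write_text(
            "---\n"
            "name: monitor-gitlab-ci\n"
            "description: Watch CI until a job fails or everything passes\n"
            "---\n"
            "# Monitoring GitLab CI\n",
            encoding="utf-8",
        )
        print(discover_skills(Path(tmp)))
```

Keeping the parser to plain `key: value` lines avoids a YAML dependency; a real tool would likely need a proper YAML parser for nested or multi-line metadata.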

But thankfully for you I'm smart enough to show you how it's done

17.02.2026 04:17 👍 0 🔁 1 💬 0 📌 0

ctrl-s to stash is also a super power

11.02.2026 20:28 👍 1 🔁 0 💬 0 📌 0

Agent teams are pretty cool but it’s hilarious how often the orchestration Claude gets mad that things are taking so long

09.02.2026 20:30 👍 1 🔁 0 💬 0 📌 0

age verification is functionally a huge security hazard for _all_ internet users but the government will call you a danger to children if you point this out

09.02.2026 15:37 👍 243 🔁 97 💬 2 📌 0

> you did a great job on this :)
> /reset

05.02.2026 06:36 👍 1 🔁 0 💬 0 📌 0
Scaling With Agents
With Claude Code, you're a CTO now - not an artisanal coder. How I'm breaking down work into agent-friendly chunks and using CI/CD pipelines to unlock infinite parallel compute.

How I'm using AI Agents to build big

22.01.2026 05:24 👍 1 🔁 0 💬 0 📌 0

they fire in CC pretty often for me

20.01.2026 14:54 👍 0 🔁 0 💬 0 📌 0
The Artemis II SLS rocket just outside of the VAB. The rocket is partially obscured by the Mobile launcher, with the left SRB mostly blocked. The VAB High Bay door is still open, showing High Bay 3 where SLS was stacked.


There have been five years in our history where a rocket that would carry humans to the Moon stood on a launch pad.

1968, 1969, 1970, 1971, 1972

But now, there are six.

17.01.2026 23:32 👍 93 🔁 22 💬 1 📌 1
screenshot of claude code where it says that it just finished updating firmware and tells the human to push a button


Crazy how fast things changed and now I'm a dumbo taking orders from a robot

#ClaudeCode #CircuitPython

30.12.2025 05:09 👍 1 🔁 0 💬 0 📌 0

Definitely feel this too. I’ve been full-time vibe coding at work for about 6 months now, and probably the highlight of this quarter was when I reviewed some code on the plane without internet or Claude

14.12.2025 01:04 👍 3 🔁 0 💬 0 📌 0
Hochul a 'no' on Mamdani's free bus plan; 'yes' on statewide universal childcare
The price tag to expand a universal childcare program statewide is $15B.

“I know you can’t afford things but we also cannot afford things” will be, I predict, a losing message. The world’s first trillionaire is on the horizon. “We can’t afford transportation, healthcare, food and shelter” will no longer cut it for the “progressive” party.

09.11.2025 02:14 👍 2882 🔁 516 💬 74 📌 45

spelling it æffect so I don't have to remember which it is

06.11.2025 23:28 👍 118 🔁 12 💬 7 📌 0
Comic. PERSON 1 with white hat: How tall are you? PERSON 2: 5ft 24cm [caption] When switching to metric, make the process easier by doing it in steps.


Metric Tip

xkcd.com/3164/

06.11.2025 23:33 👍 4393 🔁 553 💬 66 📌 32
BadFriends - Bluesky Filter Analytics
Find which Bluesky accounts you follow are constantly posting content that triggers your mutes and filters. Analyze your follows and clean up your timeline.

I realized that most of the time I was making filters it was for the same people. Vibe coded up an app to show me the people triggering my filters the most so I can unfollow them. I think #atproto is growing on me

badfriends.ansonbiggs.com

14.10.2025 02:41 👍 3 🔁 1 💬 0 📌 0
Using Code Agents Effectively
Building with AI agents is far from simple, here is everything I've found to get great results.
06.10.2025 03:14 👍 1 🔁 1 💬 0 📌 0

it me

01.10.2025 20:25 👍 404 🔁 48 💬 2 📌 1

Well don't I feel stupid

26.09.2025 23:02 👍 26031 🔁 6827 💬 289 📌 143
Answer from an Oxide FAQ: "Can I use an LLM to help write my materials?"


New @oxide.computer application FAQ just dropped

22.09.2025 20:05 👍 191 🔁 26 💬 7 📌 0

Unfortunately, Bluesky is unavailable in Mississippi right now, due to a new state law that requires age verification for all users.

While intended for child safety, we think this law poses broader challenges & creates significant barriers that limit free speech & harm smaller platforms like ours.

22.08.2025 19:54 👍 56424 🔁 14142 💬 2499 📌 2767

thinking about pivoting to a career in counting letters in the word strawberry, for the job security

08.08.2025 00:08 👍 262 🔁 24 💬 5 📌 2

self hosting is so much fun until the thing you haven’t thought about for two years randomly dies

01.08.2025 03:38 👍 2 🔁 0 💬 0 📌 0