I assume not
Is that gif it?
Heading which direction?
All the while I was thinking of a cheetah thread with you back in 2000 or 2001 about real parsers and BNF
I've been leaning into QuickCheck property tests and even formal executable specs and compliance tests with it. In 2hrs last night I had it generate a full spec and ~400 tests for a lang embedded in a tool I'm working on github.com/tavisrudd/ii.... So damn cheap and fast to go all out on verification
Note the ;)
I need to be more careful about which ghosts I summon github.com/tavisrudd/ii...
Here's some quickcheck property tests "you" inspired github.com/tavisrudd/ii...
It can even make decent category theory quips
The reviews were very useful. Some issues I was already painfully aware of but the framing and thinking helped Claude address them cleanly. 14 issues fixed and 196 new tests added today
Yep, here's the full set of prompts. Mine was very brief and Opus did the rest gist.github.com/tavisrudd/fe...
Seen any that ban human commits?
"As I'd tell my students: if you find yourself writing an evaluator for a tree-structured AST with variable binding, conditionals, iteration, and a module system β you've built a programming language. Own it."
@shriram.bsky.social your "ghost" wrote a review github.com/tavisrudd/ii.... Feel on point?
another ralph loop just retroactively wrote the PRDs & user stories for the insane amount of work it completed and level of complexity it glided through during the port: github.com/tavisrudd/ii...
It changed its mind after having done the work and concluded that was overblown. I disagreed with it completely.
Finally
Yeah! The prelude to this Haskell story is that the Rust version was also 100% vibe coded, but with me in the loop reviewing and course correcting heavily in the early days pre Opus 4.5.
Especially after it called out "lens ergonomics" as a high-risk item likely to blow out estimates in its pre-coding risk analysis @kmett.ai github.com/tavisrudd/ii...
@ghuntley.com thanks for the inspiration. I'll be writing this up once I have Claude credits available again
Reading its planning and handoff notes between loops (see notes/ and docs/dev/) it's very clear this is no statistical parrot.
This 15hrs from METR horizons is a massive understatement of current reality x.com/waitbutwhy/s...
And to my amazement, once Claude was done I discovered it had thrown in custom Haskell implementations of Handlebars, JSON Schema, and JMESPath because I had insisted it behave identically to the Rust version, with byte-for-byte identical output: github.com/tavisrudd/ii...
Rust (30kloc plus 20k tests) to Haskell in a single-day Ralph loop, anyone? github.com/tavisrudd/ii...
During the planning I prompted it to channel its inner Steve McConnell, and during review its Yaron Minsky and Kmett. That helped. Next port I'll try channeling you
The other ports were from a 1700-line Cython pubsub message bus (because I thought concurrency might be hard for it) to 1) asyncio Python, 2) TypeScript, and 3) Zig (because I thought it was obscure enough to be a challenge). It nailed all three of them with complete test coverage in 40 minutes.
If you look in the notes/ and docs/dev/ folders you can see its planning, ADRs, and memory files as it worked. The lesson for me out of this was that test oracles and rigorous specifications are the most important things. It can handle the architecture.
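
For anyone wondering what "test oracle" means in practice here, this is a rough sketch of the idea, not the actual harness from the repo. The original implementation acts as the oracle, and the port has to match it byte for byte on randomly generated inputs. Everything in it is a hypothetical stand-in: `reference_render`, `ported_render`, the toy `{{name}}` template syntax, and the input generator are invented for illustration only.

```python
import random


def reference_render(template: str, values: dict) -> bytes:
    """Hypothetical stand-in for the original implementation (the oracle)."""
    out = template
    for key in sorted(values):
        out = out.replace("{{" + key + "}}", str(values[key]))
    return out.encode("utf-8")


def ported_render(template: str, values: dict) -> bytes:
    """Hypothetical stand-in for the port under test (different algorithm)."""
    parts = template.split("{{")
    result = [parts[0]]
    for part in parts[1:]:
        name, sep, rest = part.partition("}}")
        if sep and name in values:
            result.append(str(values[name]) + rest)
        else:
            # Unknown or unterminated placeholder: leave the text untouched.
            result.append("{{" + part)
    return "".join(result).encode("utf-8")


def random_case(rng: random.Random):
    """Build a random template plus the values to substitute into it."""
    pieces = ["hello ", "x=", "\n", "{{a}}", "{{b}}", "{{c}}", "{{missing}}", " end"]
    template = "".join(rng.choice(pieces) for _ in range(rng.randint(0, 8)))
    values = {name: rng.randint(0, 999) for name in ("a", "b", "c")}
    return template, values


def differential_test(cases: int = 200, seed: int = 0) -> list:
    """Return every generated input on which the port disagrees with the oracle."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    mismatches = []
    for _ in range(cases):
        template, values = random_case(rng)
        if reference_render(template, values) != ported_render(template, values):
            mismatches.append((template, values))
    return mismatches


if __name__ == "__main__":
    failures = differential_test()
    print("mismatches:", len(failures))
```

The comparison is on raw bytes rather than parsed structures, so whitespace and encoding differences count as failures too, which is the strictness that "byte for byte identical" asks for.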