Ender's Game but the players are all Claude
Respectfully I think you may be doing what the kids refer to as "crashing out"
I do, yes.
this is true and LLMs used to only produce interesting output in the top left corner but nowadays they are spending a lot more time in the bottom right corner
I don't hate you; I just think you might not know what the person you were replying to meant exactly by "refactoring". I think they meant something much more complex than what you inferred.
Or at loving
Yeah this is really more like what I'm saying. In many cases it's not that it's dangerous, just that it's wrong in ways that can be very concretely counterproductive. Although depending on how lost in the sauce you get, it can also be dangerous.
I think there is a good debate in this thread and I respect @sciortino.bsky.social's position (although I believe I am right and he is wrong).
I think this is largely true for small numbers, with a few bizarre exceptions (9.8-9.11), but for large enough numbers (which is a stand-in for arbitrary tasks of long enough duration) it can't tell the difference, and what's more important to it than the guy being right is the guy being confident.
No, not really. For large enough numbers, I don't think the reward function knows the difference between a story about a guy who confidently outputs the wrong answer and a story about a guy who confidently outputs the right answer.
The trouble is you don't have such a verifier on hand for every arbitrary task. If you did, you could probably solve your problem without the LLM in the first place. But there are many tasks (SWE chiefly among them) where such verifiers are fairly easy to come by.
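For instance (an illustrative example of my own, nothing fancy): an ordinary unit test is exactly such a verifier. A proposed patch either passes it or it doesn't.

```python
import json

# An ordinary unit test as an SWE verifier (illustrative example): any
# proposed patch either passes this check or it doesn't.
def test_roundtrip():
    assert json.loads(json.dumps({"a": 1})) == {"a": 1}

test_roundtrip()  # silence means the patch cleared the verifier
```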
The LLM is indifferent between outputting the correct product of two large numbers and outputting an incorrect one, because in the fictional story it tells about a guy who does a good job, the fictional product he comes up with is "correct" either way.
But like I was saying in the forked thread, if I put that into a system with some kind of verifier loop, then I have something with the goal of exiting the verifier loop.
I'd say modern LLMs have the goal of producing a sequence of tokens that yields a high reward without veering too far from the pre-training distribution. Practically that means outputting a little story about a guy who does a good job.
I do think that the picture is complicated by so-called scaffolding, and this is a huge and underrated part of the success of things like Claude Code (and I'm sure what you're working on as well!). If at each step in the multiplication there was a check that the step had been carried out correctly,
and if it wasn't, the LLM had to go back and try again in a loop until the step was carried out correctly, now I argue we've got something more goal-pursuey. The scaffolding creates the feedback loop which makes it more like a rational agent and less like a fancy autocomplete (positive affect).
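A minimal sketch of the kind of loop I mean, assuming a generic `llm` callable and an exact per-step check (all names here are made up):

```python
# Minimal sketch of a verifier loop (all names hypothetical): `llm` is any
# prompt-in, text-out callable and `check` is an exact verifier, which we
# happen to have for arithmetic steps.

def verified_step(llm, prompt: str, check, max_retries: int = 10) -> str:
    """Ask the model to carry out one step; loop until the verifier passes."""
    for _ in range(max_retries):
        answer = llm(prompt)
        if check(answer):
            return answer  # the only way out is a verified answer
        prompt += f"\nYour answer {answer!r} failed verification. Try again."
    raise RuntimeError("no verified answer within the retry budget")

# e.g. one partial product in a long multiplication:
step_ok = lambda ans: ans.strip() == str(7 * 8)
# verified_step(llm, "What is 7 * 8? Reply with just the number.", step_ok)
```

The loop, not the model, is what supplies the insistence on exiting only with a verified step.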
Of course no one cares about multiplying large integers per se, but people do care about many tasks that I would argue are analogous to multiplying large integers in that they involve correctly carrying out a sequence of steps with no mistakes.
In particular, if the reason it can't multiply is that it just isn't smart enough, then at some point making it smarter should lead to it gaining the ability to multiply. But if the reason is it just doesn't want to, then you need some other strategy; you need to make it want to.
I'm not really concerned with the nature of humans, squirrels, or pet cats. I'm just talking about LLMs. I think it matters practically because I think it has implications about what you can expect them to be able to reliably do, and what "scaling" as opposed to other strategies can help with.
It's true that many IDEs have a "Refactor" menu much like a word processor has an "Edit" menu, and the Refactor menu is a list of tools for refactoring analogous to Cut, Paste, Find and Replace, etc., but that wouldn't make editing writ large a "very low bar"
I wonder what this person thinks refactoring is
I can see that you don't understand what I'm saying and I think that's okay.
I think if you model your 7yo as an agent who optimizes in the rational pursuit of multiplying large numbers you will be led to make some bad predictions about what your 7yo actually does
correct, I don't think your 7yo's goal, in the sense of being an objective which your 7yo rationally pursues, is to multiply large numbers.
you're not going to like this answer but, for example, multiplying large integers. You're going to say, oh it's just bad at that. But it's good at all the parts of it. If you can multiply small integers you can multiply large integers and it can multiply small integers. If it wanted to it would.
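To spell out "good at all the parts" (a standard schoolbook sketch of my own, not anything the model runs): large-integer multiplication is nothing but single-digit products plus carries.

```python
# Schoolbook long multiplication reduced entirely to small-integer subtasks:
# single-digit products plus carries. Each subtask is trivial; the whole task
# fails unless every one of them is carried out without a mistake.

def long_multiply(a: str, b: str) -> str:
    """Multiply two non-negative integers given as decimal strings."""
    result = [0] * (len(a) + len(b))
    for i, da in enumerate(reversed(a)):
        for j, db in enumerate(reversed(b)):
            result[i + j] += int(da) * int(db)        # small-integer multiply
            result[i + j + 1] += result[i + j] // 10  # carry to the next column
            result[i + j] %= 10                       # keep a single digit
    return "".join(map(str, reversed(result))).lstrip("0") or "0"

assert long_multiply("123456789", "987654321") == str(123456789 * 987654321)
```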
The difference is that an agent which wants to do a task T which is accomplished by correctly carrying out subtasks X, Y, Z, each of which subtasks it appears to be "able" to carry out competently, will always accomplish T. Whereas a random bumbler will only sometimes or never accomplish T,
and its failures to accomplish T will seem quite baffling given what you understand about its ability on tasks X, Y, Z and the assumption that it wants to accomplish T, but will appear quite normal and expected if you abandon the notion that it gives a damn one way or the other about T.
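To put illustrative numbers on the bumbler case (figures assumed, not measured): if each of n independent subtasks succeeds with probability p, the whole task succeeds with probability p**n, which collapses as n grows.

```python
# Illustrative arithmetic: 99% reliability per step, 100 steps.
p, n = 0.99, 100
print(p ** n)  # ~0.366: the full task still fails most of the time
```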
This is precisely where we disagree, yes. I think you are the one who is confusing "It doesn't want to do X" with "It's very bad at X", or more precisely "It rarely succeeds at X." I do think the distinction has practical and pragmatic consequences.
Yeah exactly
You'll get some output like "hello there I am a paperclip maximizer" but it won't actually maximize paperclips.