@vcarchidi
Defense analyst. Tech policy. Have a double life in CogSci/Philosophy of Mind and will post about it. (It's confusing. Just go with it.) https://philpeople.org/profiles/vincent-carchidi All opinions entirely my own.
we want functional tools, not to have to offer prayers to the machine spirit to get excel to work
I *do not know* if large language models are conscious, but if you criticize Claude one more time, I *will* defend his honor.
Incredible how uncanny bot posts are in the wild
I guess FWIW, Anthropic has been known to be hawkish for a long while, and many Anthropic fans are well aware of this. Probably some cognitive dissonance going on.
Looking at a lot of people here who seem to think there's an incredible moral difference between generating targets and only having some of those targets selected.
Yeah, absolutely. I don't doubt it's being used for a number of things (even if DoD is currently exaggerating how impactful it is because they're throwing a tantrum).
But insofar as this is a moral issue, I think it's an issue for Amodei and those who built shrines to Amodei.
(Bait in the sense that it draws energy toward AI-for-targeting while the actual war of choice continues, not in the sense that the reporting is untrue.)
That said, I think @kevinbaker.bsky.social might be right that the AI for target acquisition stuff is bait, and some of us (me included) took it.
I highly doubt the target bank built before the conflict made excessive use of Claude (i.e., I doubt the targets went unverified).
I do have concerns about its use during the conflict.
In either case, there's no way to go from use of Claude -> the school strike, without further info.
The WP report doesn't make that connection, IIRC. They note Claude was used in service of:
- Building a target bank *before* the conflict (nomination list).
- Acquiring new targets in real time *during* the conflict.
FWIW, there's no substantive connection based on public info between the use of Claude and the elementary school strike in Iran, unless I missed some news (possible).
Surprised we're still talking about this guy. Still feel this way about him.
I'm just frustrated by the endless talk about "competition" in technology, usually with a reference to China, where the author simply assumes a certain set of values about work and technology under the guise of "winning" that competition.
Yeah, I think the framing you describe is one I basically resisted for a few years, trying instead to view it purely through the lens of the technology's capabilities and so forth.
But that approach is basically incoherent. There is no way to detach an evaluation of a technology from its applications.
It's more noticeable in some areas than others. Policy writing, I think, is the area where the style, tone, phraseology, and structure are just extremely uniform. (A good and a bad thing tbh.)
I actually 100% believe my writing across different venues improved significantly as a result of LLMs exposing how much repetition there is in a given field. I became much more conscious of it.
And I think, going off all that, the end-user 'tinkering' (basically seeing what kinds of side projects one can get up to) is good for the end-user. But it does not really translate into the kinds of interestingly novel applications that raise the bar for human life.
I suspect we have gotten too caught up in debates about novelty - novelty which the end-user cannot anecdotally gauge (because the training set is too massive), and novelty which appears to be of a kind that interpolates within humanity's knowledge enclosure.
But I think, so far, the practical upshot of this is almost entirely the acceleration of tasks that (a) could already be done, but more slowly; or (b) could already be done, but the user did not have time to do.
But mere acceleration of tasks is not what raises the standard of civilization.
So, there is a dynamic where the model possesses humanity's knowledge enclosure, has constraints sufficient to engage in a kind of partial automation of, e.g. code generation, and the human end-user is essentially given the burden of leveraging this system in a way that is *interestingly different.*
In this way, constraints enable capabilities. This is clearly a net gain.
But the gain is thoroughly embedded in humanity's knowledge enclosure. The human end-user cannot fathom the scale of that enclosure, and the novelty of the outputs is likely exaggerated in their eyes. It's practically impossible for it not to be.
It's interesting. Claude Code, in one sense, is moving in the right direction by essentially imposing constraints on the base LLM's output such that what the end-user receives is far more useful as part of a workflow than what they otherwise might get from interacting with it directly.
Though I wonder - and this is why the focus of the above is on safety-critical applications - if we all intuitively have trouble distinguishing between the novelty of personal use-cases with assistant-like LLMs and the society-changing novelty of past transformative technologies.
This is on my mind a lot, and I don't think the flip-side of what I'm saying can be dismissed: there really is a way to glean some kind of productivity gains or otherwise personal gains by using LLMs in this post-hoc fashion (basically figuring it out as one goes).
What I mean:
bsky.app/profile/vcar...
Oh no, not at all. Sometimes I say things not realizing there's a lot of baggage implied.
What I mean is that the efforts are sort of torn between R&D and product development, which is natural, in a way that departs from earlier goals of the field.
Which *does* lead to effective products FWIW.
Could you elaborate? I'm not sure what you mean here
(I view the scaffolding that goes into something like Claude Code as an example of AI being successful by essentially lowering standards for what it is meant to achieve. Controversial maybe, but I see this as sort of a sacrifice of what was initially envisioned, even though very effective.)
I'm not looking for a God in a box so much as a system that can actually allow us to do new things, with assurances on its outputs, and a meaningful ability to intervene on it during safety-critical operation.
On the general point, I completely agree. I don't think a general system could be validated like that. I do think uncertainty quantification shouldn't be slept on, but the goal is to use neural nets in a way that doesn't completely trade away the new capabilities they offer by turning them into a form of brittle, narrow AI. I think this is a bigger problem than is appreciated atm.
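(If "uncertainty quantification" sounds abstract: here's a minimal sketch, in the Monte Carlo dropout style, of the kind of thing I mean. The toy model, the 0.15 abstention threshold, and the random input are all made-up stand-ins, not any real system's settings: leave dropout on at inference, run many stochastic passes, and defer to a human when the predictive spread is too wide.)

```python
# Minimal, illustrative sketch of uncertainty quantification via Monte Carlo
# dropout (Gal & Ghahramani style). The architecture, abstention threshold,
# and input below are hypothetical stand-ins, not any deployed system.
import torch
import torch.nn as nn

# Toy classifier with a dropout layer we can keep active at inference.
model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 3),
)

def mc_dropout_predict(model, x, n_samples=50):
    model.train()  # train mode keeps dropout stochastic at inference
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        )
    # Mean over samples = prediction; std over samples = uncertainty estimate.
    return probs.mean(dim=0), probs.std(dim=0)

x = torch.randn(1, 16)  # stand-in input
mean, std = mc_dropout_predict(model, x)
if std.max().item() > 0.15:  # made-up abstention threshold
    print("too uncertain -> defer to a human operator")
else:
    print("predicted class:", mean.argmax(dim=-1).item())
```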