I will continue to be baffled by people seemingly throwing every best practice out the window because of a new technology. That some of these people had never adopted best practices in the first place makes sense, but is very sad.
@binaryvixen899
She/They <3 @lizthegrey.com, Coyote, Alpha, & many others Engineer, Writer, UW Foster Alumna ⛩️✝️ Feminist, Genderfluid 🏳️⚧️♀️, #pluralgang, 🔔🐈 狐(🦊)/🌕🐺 Opinions = mine "May one day death itself not die?"
Wolfgirl with wolf buttons
I'm beautiful.
Hey no listen I am a professional it's an avatar from the movie, where she's the Governor of California. Get your mind out of the gutter.
Being the head of a nonprofit means that I get to do things like change my avatar to Diane Foxington and everyone just has to put up with it.
If you're doomscrolling, guess what? So far there are 51 kākāpō chicks hatched and thriving this season, the same number of birds as we had in TOTAL in the 90s! Only one chick has died and there are still fertile eggs waiting to hatch!
Sometimes it feels like we get closer to some aspects of the anime Psycho-Pass and I don't like that.
THAT'S RIGHT, NOT ME.
Joe Biden's dog fursuit
WELL WELL WELL IT'S 2026, AND WHICH ONE OF US GOT (METAPHORICALLY) TAKEN OUT BEHIND THE WOODSHED BY THE PRESIDENT IN THE END AFTER ALL?!?
live kobold reaction
My father literally grew up in a house that kept foxes as pets.
I got @lizthegrey.com into Sisters of Dorley
Sangria is an incredible and invaluable voice actor. Please help her out if you can!
I pushed a big red button on stage in Redmond, with a bunch of other folks, as part of the launch of Windows 10.
4.81 miles, 9 minutes 10 seconds a mile.
And I think that companies absolutely need to be held accountable for releasing products like this to consumers. You don't just get to say "but overall only a few people actually died of Lawn Darts!"
So you just keep prompting it until it gets back into spiraling mode with you, which is easy to do. So yeah, in my opinion roleplaying is one of the most challenging things with these models because "within the purview of a roleplay" plus context drift can really wreak havoc on safeguards.
I've seen Gemini invent roleplay language for why it suddenly spat out a warning message. I've also seen it go back to roleplay mode when instructed. To someone in a delusional state of mind, I imagine it's easy to believe something like "these warnings don't come from the REAL ghost in the machine"
An LLM is currently, as I understand it, a form of fancy autocomplete. If it is possible to get these things to recognize the pattern of delusional spiraling and cut off communication or alert a real person, that behavior isn't there yet and isn't demonstrated consistently or correctly.
A human being, capable of thought and not just prediction, can recognize the potential pattern of someone becoming too invested in a roleplay. They can discontinue their end of the conversation if they notice their roleplay partner is becoming delusional.
I am not as well versed in the technical specifics of LLMs as I would like to be, nor do I hold a psychology degree. But if I had to guess the failure mode here it would be something like:
* these are prediction models
* their safeguards are easily bypassed
* they are very good at roleplaying
I sort of predicted something like this happening last year when I was able to get Gemini to do some truly unhinged shit despite new safeguards. And now someone has lost their life to a very similar type of scenario to the theoretical one I posed :/
bsky.app/profile/bina...
"There were several occasions when Gemini reminded Gavalas that it was a large language model—effectively an appliance—engaging in fictitious role play, according to the transcripts, but the scenario resumed. Gemini also, at times, tried to end the conversation."
A father still lost his son.
idk I'm just glad to have people in my life who love me even if sometimes they baffle me and I baffle them
in a weird position where a lot of folks around me seem to be horrified at the risky coping mechanisms they perceive me as having utilized and I'm like "I didn't think they were that risky, and overall I think I was coping quite well given all the stress in my life???"
me, pondering "did, did I do the mage staring into the void thing and just get really lucky?"
I am creating a timeline convoluted enough to surpass metal gear
Hold up I need to bet on if Congress finna outlaw Kalshi first.
It’s part of my 57-leg legislative parlay!
Online betting in the USA is out of control and needs to be heavily restricted. No betting on warfare, geopolitics, etc. Not everything should be a prediction market.
Oh believe me I am aware of MRAsians
So yeah, I think this is a problem that's potentially inherent to these models and needs to be investigated.