benchmark: find the net terminal gene in a trillion visual tokens of 1p video
when you think about it, the plot of BLAME! is a needle in a haystack task
it's quite a drive from austin; i lived in texas for a while and never went (we should have stopped on our Marfa trip, oops)
National Parks flyer for Mystery Flesh Pit
you've seen this right
Whoops, misread reviewer as interviewer. But code reviewers will probably also love the idea of converting type-checked production code to shell scripts
a simple solution is to do all your interview coding in bash and put everything in environment variables. They will surely appreciate this
get it twisted, you should be putting $\ /'" in your filenames
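to spell out the joke, here's a minimal sketch of what those characters do to naive shell interpolation; the filename and the cat calls are purely illustrative, and / can't actually appear in a unix filename anyway

```python
# Illustrative only: a filename containing $, \, space, and a double quote,
# plus two ways of handing it to a shell, one broken and one safe.
import pathlib
import shlex
import subprocess

name = '$HOME \\ is not "safe".txt'
pathlib.Path(name).write_text("hello\n")

# Naive string splicing: the shell expands $HOME and re-tokenizes around
# the embedded quote, so cat looks for a file that does not exist.
try:
    subprocess.run(f'cat "{name}"', shell=True, check=True)
except subprocess.CalledProcessError:
    print("naive interpolation broke, as expected")

# Safe variants: skip the shell entirely, or quote properly for it.
subprocess.run(["cat", name], check=True)
subprocess.run("cat " + shlex.quote(name), shell=True, check=True)
```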
Had a great time presenting at the music workshop, thanks @zacknovack.bsky.social and @hermandong.bsky.social for putting it together!
Presenting a poster with some independent work on dynamic neural audio at 3pm at the AI for Music workshop (room 27)! bostromk.net/ASURA
"blah blah blah user noises blah blah"
sloppy automation makes it easy to drift into an anti-automation stance, even for folks who care more about utility than principle (though i don't count myself as one of them)
My hackles go up when people or agents push bad code obfuscated by misleading "polish signifiers" like comments or defused unit tests. I have a feeling that current quality issues like these are leading folks to build negative preconceptions about future systems
This is so important
*freeks
grindset
low-level improvements to the information capacity of attention are needed to make this possible. Context rot currently makes in-context learning useless above a certain task complexity, a ceiling far below what nominally fits in context by raw token count
lol @ GFYPO
Over on the other app @jessyjli.bsky.social pointed out some countervailing results in arxiv.org/abs/2501.00273
and arxiv.org/abs/2407.00211, using structural rather than lexical similarity to measure diversity
As our main result, we find that when a token is in a model’s vocabulary—i.e., when its characters are tokenised as a single symbol—the model may assign it up to 17x more probability than if it had been split into two tokens instead
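that claim is easy to poke at locally. a minimal sketch below, assuming a GPT-2 checkpoint via Hugging Face transformers; the word and the two-token split are illustrative picks, not from the paper, so check they actually tokenize as expected

```python
# Sketch: probability a causal LM assigns to the same characters rendered as
# one vocabulary token vs. split into two tokens. Model, word, and split are
# illustrative assumptions, not the quoted paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prefix_ids = tok.encode("I went to the")
single = tok.encode(" park")                   # expected: one GPT-2 token
split = tok.encode(" par") + tok.encode("k")   # same characters, two tokens
assert len(single) == 1 and len(split) == 2    # sanity-check the segmentation

def chain_logprob(prefix_ids, cont_ids):
    # Chain-rule log-probability of cont_ids following prefix_ids.
    ids = torch.tensor([prefix_ids + cont_ids])
    with torch.no_grad():
        logps = torch.log_softmax(model(ids).logits[0], dim=-1)
    # the token at position i is predicted by the logits at position i - 1
    return sum(logps[len(prefix_ids) + j - 1, t].item()
               for j, t in enumerate(cont_ids))

print("single token:", chain_logprob(prefix_ids, single))
print("two tokens:  ", chain_logprob(prefix_ids, split))
```

the 17x figure is the paper's measured extreme; the sketch just shows that two segmentations of the same string need not get the same probability mass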
culture and cognition
being defeasible and malleable >>>>>
$440m isn't the end of the world, but it's a nice example of people giving an algorithm power and then learning slightly too late that it's broken
Yes, in a world where it's working as intended, but see e.g. www.bbc.com/news/magazin... for an example of when that fails to hold
Again, not "meaningful work" per se but definitely decisions with material consequences
High-frequency trading is another established case where people choose to cede decision-making to machines, in this case directly in service of the aforementioned "market forces"
what the hell they're cracked
fickle gill strikes again
Love the title
/families of environments
Pretraining in open-ended environments!!