Ben Kirwin

@bkirwi

just setting up my twttr

77
Followers
223
Following
15
Posts
05.11.2024
Joined

Latest posts by Ben Kirwin @bkirwi

gradient.horse Draw a horse, watch it run!

omg everybody go draw a horse this is what the internet was made for

gradient.horse

09.02.2026 23:11 👍 7013 🔁 3611 💬 35 📌 154

Pretty shameful, Ms. Anand!

03.01.2026 22:29 👍 1 🔁 0 💬 0 📌 0

congrats! last time we caught up you were i think just acquiring a much smaller electric boat... cool to hear you've been Scaling Up. is the cat in the water already?

03.05.2025 21:52 👍 0 🔁 0 💬 1 📌 0

afaict you either need to argue that i've infringed by producing a copy of an article that i've never seen; or that the model creator infringed, and the model does "contain" a copy of the article in some sense, even though the model is definitely not "just" a copy of those inputs...

07.04.2025 03:32 👍 1 🔁 0 💬 0 📌 0

suppose: the nyt example was for an open-weights model like llama; i get the model and recover an nyt article from it, like they demonstrated in court. i now have an illegal copy; where's the copy from?

07.04.2025 03:30 👍 1 🔁 0 💬 1 📌 0

sure, happy to leave it here, and ultimately this is something a judge will decide as you say! but i will drop a last thought at the end here anyways since i already typed it up...

07.04.2025 03:27 👍 1 🔁 0 💬 1 📌 0

sorry if i'm being pedantic! but this kind of hair-split is the sort of thing the law cares about and i think the article is a little fuzzy on... 😅

07.04.2025 02:59 👍 2 🔁 0 💬 0 📌 0

is it? in a section with a summary like "it’s still critical that training not involve copying", it seems relevant that quite a bit of copying happens in practice, and that it's hard to prevent.

07.04.2025 02:54 👍 0 🔁 0 💬 2 📌 0

for sure, but "my system only copies a small percentage of (the ~entire internet)" and "i wish my system did not copy data so often" are not arguments that copying is not happening...

06.04.2025 00:57 👍 0 🔁 0 💬 1 📌 0

good news! they shared the prompts: nytco-assets.nytimes.com/2023/12/Laws...

05.04.2025 20:33 👍 1 🔁 0 💬 1 📌 0

and if they do it in public it can be copyright infringement!

05.04.2025 17:12 👍 0 🔁 0 💬 0 📌 0

and i don't find the article's treatment of this super convincing: it agrees that all models do this, says it's bad, and then ignores it in the conclusions...

05.04.2025 17:11 👍 0 🔁 0 💬 1 📌 0

i was happy to see you share this article; i think it's more right than most things written on this topic! but e.g. when the nyt can get a model to spit out its articles nearly word for word, i think there's a pretty clear argument that a copy has been made and distributed...

05.04.2025 16:46 👍 0 🔁 0 💬 3 📌 0

for ~all common models, it's quite easy to get an llm to spit out portions of its training data verbatim... hard to argue that distributing those models is not distributing that data in a legal sense!

05.04.2025 16:36 👍 0 🔁 0 💬 1 📌 0

thanks for sharing! read the vulnerability report from citizenlab... looks like the issue was in the keyboard, and citizenlab still recommend using signal. (with all the security settings turned on!)

08.03.2025 17:07 👍 0 🔁 0 💬 0 📌 0

oh hey congrats! i remember you were taking another swing at this - glad to see it over the line

18.11.2024 03:30 👍 3 🔁 0 💬 0 📌 0