i also think that masked data language models learning how texts really get corrupted could be the future of entity resolution.
i'm always saying this.
active generalized category discovery with a steorts-style microclustering prior seems like the future of entity resolution.
no offense but there are too many "labor labs"
man, labor notes is sold out already.
Graphic advertising a happy hour sponsored by the Indianapolis NewsGuild and the NewsGuild at NICAR 2026. The happy hour will run from 6-8 pm March 5 at O'Reilly's Irish Pub and Restaurant
Are you a union member or want to be and you're at #NICAR26?
The Indy NewsGuild and @newsguild.org are sponsoring a happy hour tonight from 6-8pm downtown. (appetizers provided!)
it was not as good as what i had, but it was still wonderful.
i had banitsa for the first time a week ago, and it was amazing. tonight i'm making it at home, and i had not previously appreciated the amount of cheese and butter.
Assorted stickers and documents on a desk. One group of stickers says "libfec" in green font. The other group says "Datasette" in purple. One document stack has the title "NICAR26 libfec manual". The document says "libfec cheatsheet".
I will be at #NICAR26 this week teaching classes!
For campaign finance folks: come to my Thursday 11:30am class on libfec, a new fast CLI tool for working with FEC data!
Also get limited-edition libfec stickers + a campfin "zine"!
(Bonus: find @simonwillison.net and me for Datasette stickers)
i do not doubt that huge parts of the quantitative social sciences are potentially automatable, partially because they already seemed as if they were.
really appreciate what @theorangeone.net has enabled!
go look at the moon
someone who has ever used version control should invent a .po file format
that is a word.
within about 10 iterations it was hitting over 99% accuracy. the first few iterations missed important corner cases.
this exchange got me to set up a loop for a situation where i have ground truth labels for thousands of very ambiguous short text identifiers of labor unions. i had claude draft a prompt for a subagent to find the real union in a local db, return its results, score its accuracy, and refine the prompt.
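a minimal sketch of what that kind of score-and-refine loop could look like. everything here is hypothetical: `refine_loop`, `toy_agent`, `toy_refine`, and the labeled strings are stand-ins made up for illustration, and in the real setup the agent and refinement steps would be model calls, not string lookups.

```python
def refine_loop(prompt, labeled, run_agent, refine, target=0.99, max_iters=10):
    """Score a prompt against ground truth, refine it on the misses, repeat."""
    history = []
    for _ in range(max_iters):
        # collect every labeled example the current prompt gets wrong
        misses = [(text, truth) for text, truth in labeled
                  if run_agent(prompt, text) != truth]
        score = (len(labeled) - len(misses)) / len(labeled)
        history.append(score)
        if score >= target:
            break
        prompt = refine(prompt, misses)
    return prompt, history


def toy_agent(prompt, text):
    # stand-in for the subagent: looks the text up in "alias=name" lines
    for line in prompt.splitlines():
        alias, sep, name = line.partition("=")
        if sep and alias == text:
            return name
    return "unknown"


def toy_refine(prompt, misses):
    # stand-in for the refinement step: fold one missed case into the prompt
    text, truth = misses[0]
    return prompt + f"\n{text}={truth}"


# made-up labeled identifiers, purely illustrative
LABELED = [("seiu 1199", "SEIU"), ("uaw loc 600", "UAW"), ("ibew 3", "IBEW")]

final_prompt, history = refine_loop("", LABELED, toy_agent, toy_refine)
print(history)
```

the toy refinement step just memorizes its misses, so it would say nothing about identifiers outside the labeled set. scoring on a held-out slice of the labels is one way to check whether a refined prompt is actually generalizing.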
neither one of those is my position.
a skill that expertly executes the dominant, expert-level search strategies for legal research is a useful skill! but it's not the same thing as covering the space of legal research.
RL would seem to have the same problem: it could learn from your feedback on one legal research question, but it's unlikely to cover the space unless the examples you give it cover the space.
a skills document that captures all that variation would be very very long because it would have captured all the different cases, but it could be general to problems in the domain.
let's posit that when claude does those procedures it's near expert level at executing them.
but there are legal research questions where those procedures will not work well, and bruenig will go to some other strategy or invent a new one.
the thing is that experts don't have just one procedure. let's posit that the skill bruenig wrote does capture the dominant procedure and maybe even a few fallback procedures he uses when the first doesn't work.
RL should soften but not eliminate the domain specific generalization problem.
most abstractly, these procedures encoded in skills can introduce their own generalization errors *within the target domain*
the instruction set of these skills seems like the same kind of problem (indeed i have experienced just this generalization problem for other skill-like things i have built), and trusting that the skill can be effective within its appropriate domain is exactly the concern.
in my experience, it's just very easy to write tools that solve the problems you know the answer to that do not generalize to the ones you don't. a simple example: writing a regex that solves all the cases you know about is almost certainly going to fail on the ones you don't.
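to make the regex point concrete, here is a small sketch. the pattern and both lists of identifiers are made up for illustration (they are not from any real dataset): the pattern is tuned until it matches every case in `KNOWN`, then tried on variants it was never checked against.

```python
import re

# hypothetical identifiers the pattern was tuned against
KNOWN = ["SEIU Local 1199", "UAW Local 600", "IBEW Local 3"]

# hypothetical variants that only show up later
UNSEEN = ["SEIU Loc. 32BJ", "Teamsters L.U. 705", "UNITE HERE Local 11"]

# fits every known case: one all-caps acronym, the word "Local", a number
PATTERN = re.compile(r"^([A-Z]+) Local (\d+)$")


def hit_rate(pattern, examples):
    """Fraction of examples the pattern matches."""
    matched = [s for s in examples if pattern.match(s)]
    return len(matched) / len(examples)


known_rate = hit_rate(PATTERN, KNOWN)
unseen_rate = hit_rate(PATTERN, UNSEEN)
print(known_rate, unseen_rate)
```

the pattern looks perfect if you only score it on the list it was written against, which is the worry with any tool adjusted until it works on the cases its builder knows best.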
the thing i worry about building these kinds of tools is that they are "overtrained" on the things that the builders know a lot about. like bruenig knows some parts of labor law a lot better than others and he adjusted the skill until it worked well on the parts he knows about.