No problem!
Ah that's fair. I think the authors consider the topic of adversarial robustness to be out of scope; there are other papers that measure this better:
- arxiv.org/abs/2409.14285
- arxiv.org/html/2501.03... (ours)
Oh nooo unfortunately the AI answer is confidently wrong. We don't use any perplexity or burstiness metrics. It's a specialized ML algorithm trained to detect AI/Human/AI-assisted with a setup that measures the magnitude of AI edits
arxiv.org/abs/2510.03154
I'm glad you cited this study, curious how you think it's flawed?
Founder of Pangram here to comment on a couple of points
- human writing flagged as ai? That's a big problem: we benchmark at 1 in 10,000 false positive rate and stand by it. Was this real world writing?
- we are 12 people. 2 sales, 1 marketing, and the rest of the team is technical
Pangram is releasing a new model today!
Pangram 3.2 is a significant improvement over 3.1 in several aspects
- Improved performance on 'humanized' text
- Improved performance on adversarial prompts
- Minimum word count reduced from 75 to 50
- Overall improvement on Claude 4.6 recall
Wow this is so cool
Would 100% be possible to do a labeler based on reports. Probably too pricey for all content.
Bsky bot is still on my side project to-do list
ai dot com purchased for $70m.
METR evals going exponential.
former crypto grifters signalling we're in a bubble
big labs signalling recursive self improvement
I wonder, if the bubble pops, will it even matter?
I'm not seeing the self-evident part of your argument
Me too. I really hope it's the former
I get why Google and Meta don't always have the data to do this, but it's inexcusable when Amazon shows me ads for things that I just bought
I was going to write a whole effortpost about model evaluations but instead here, just take this chart
But overall, I believe that an open Internet and free technologies like ChatGPT will bring so much good into the world, and ads are the only way to make tech like this universally free.
Yes, there are lazy and bad ways to do ads. One of the worst things companies can do is to show ads to already-paying customers. This happens most often when consumers have no alternative to a monopoly.
Digital ads have better targeting and attribution. Well-targeted ads cost more, meaning that more marketing dollars are exhausted on fewer ad impressions. A world where you see half as many ads because they are twice as relevant is a world in which everybody is happier.
Ads enable a powerful technology to become a global utility. They are the great equalizer: even the poorest user can pay for the service with their attention.
Digital ads are one of the greatest positive-sum technologies of the 21st century.
Google Search has 4 billion users, and they make on average $40 per user per year. This type of business model couldn't exist without ads; Google cannot find 4 billion people who would pay $40 for ad-free search.
Allow me to introduce cases where human intelligence is not fixed
Hurts my heart... Poor llm
Lmao
Pretty cool project on /r/localllama - they take human-written text and sloppify it 10x with 4o-mini, then train the model to de-slop by reversing the transformation
extremely prescient
Thank you!
There are only two hard problems in computer science: building artificial superintelligence, and naming things
🚨 New Study 🚨
@arxiv.bsky.social has recently decided to prohibit any 'position' paper from being submitted to its CS servers.
Why? Because of the "AI slop", and allegedly higher ratios of LLM-generated content in review papers, compared to non-review papers.
Thanks I appreciate that!
Omg I didn't even clock that
I will let you know when I try atproto! Surely it's better documented than the X api