CDS Research Scientist Ravid Shwartz-Ziv (@shwartzzzivravid.bsky.social) recently provided expert analysis on DeepSeek's latest AI developments in TechCrunch.
techcrunch.com/2025/01/30/h...
I have an awesome idea that no one has tried before: RL on math datasets!
You will have a natural verifier!
Now that the ICML deadline is over, well done to all the students! And next time, please, please, please don't wait until the last moment; I'm too old for that...
Go read our paper about lazy layers!
Check out our paper for detailed experiments and explanations on how we're making AI systems more reliable by helping them better express their uncertainty!
Thank you to Tal Zeevi (who did all the work!), @yann-lecun.bsky.social, Lawrence Staib, and John Onofrey
The Paper - arxiv.org/abs/2412.07169
The results? In medical imaging, Rate-In maintains sharp uncertainty estimates around critical anatomical boundaries, while traditional methods get fuzzy. We demonstrate superior performance across different noise levels and benchmarks!
Rate-In's approach: We dynamically adjust dropout rates by measuring information loss in each layer. Where features are critical, we preserve more; where they're redundant, we drop more. Like adaptive noise, guided by information theory!
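To make the idea concrete, here is a toy sketch (this is not the actual Rate-In algorithm; the names info_loss_proxy, calibrate_rates, and the max_loss budget are made up for illustration): each dropout layer's rate is shrunk until a crude proxy for the information lost in that layer's features falls below a budget.

```python
# Toy sketch (NOT the paper's algorithm): per-layer dropout rates tuned at
# inference time so that a crude "information loss" proxy stays below a budget.
# All names (info_loss_proxy, calibrate_rates, max_loss) are illustrative.
import torch
import torch.nn as nn

def info_loss_proxy(clean: torch.Tensor, noisy: torch.Tensor) -> float:
    """Relative change in the features, used as a cheap stand-in for the
    information lost by dropping units (the paper uses an information-
    theoretic measure; this is only for illustration)."""
    return (torch.norm(clean - noisy) / (torch.norm(clean) + 1e-8)).item()

def calibrate_rates(layers, x, max_loss=0.05, init_p=0.5, shrink=0.8):
    """Walk through (linear, dropout) pairs and shrink each dropout rate
    until the proxy loss for that layer's features is under the budget."""
    h = x
    for linear, drop in layers:
        clean = torch.relu(linear(h))
        drop.p = init_p
        drop.train()                          # keep dropout active at inference
        while drop.p > 0.01:
            if info_loss_proxy(clean, drop(clean)) <= max_loss:
                break
            drop.p *= shrink                  # too much information lost: drop less
        h = clean                             # propagate clean features while calibrating
    return [d.p for _, d in layers]

# Usage: a tiny MLP with one dropout module per layer.
layers = [(nn.Linear(16, 32), nn.Dropout()), (nn.Linear(32, 8), nn.Dropout())]
rates = calibrate_rates(layers, torch.randn(4, 16))
print("calibrated per-layer dropout rates:", rates)
```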
So, how do we make AI express uncertainty during inference without special training?
Current uncertainty prediction methods (like Monte Carlo Dropout) use fixed dropout rates everywhere. They don't adapt to specific images or tasks - it's a one-size-fits-all approach!
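For contrast, here is a minimal sketch of that baseline: vanilla Monte Carlo Dropout with one fixed rate reused in every layer, dropout left active at test time, and the spread over repeated forward passes used as the uncertainty estimate.

```python
# Minimal sketch of standard Monte Carlo Dropout at inference: the SAME fixed
# dropout rate everywhere, dropout kept active, and T stochastic forward
# passes whose spread serves as the uncertainty estimate.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(), nn.Dropout(p=0.2),   # one fixed rate...
    nn.Linear(32, 8),  nn.ReLU(), nn.Dropout(p=0.2),   # ...reused in every layer
    nn.Linear(8, 1),
)

def mc_dropout_predict(model, x, T=50):
    model.train()                      # keep dropout stochastic at test time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(T)])
    return samples.mean(0), samples.std(0)   # prediction and its uncertainty

mean, std = mc_dropout_predict(model, torch.randn(4, 16))
print("prediction:", mean.squeeze(), "\nuncertainty:", std.squeeze())
```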
Imagine you're a doctor looking at an MRI scan. Would you rather have an AI that:
A) Says "There's a tumor" with blind confidence
B) Points out exactly which areas it's uncertain about, helping you focus your expertise?
A new paper!
We present "Rate-In" - a technique that helps neural networks better express their uncertainty during inference, which is especially crucial for medical applications!
with Tal Zeevi, @yann-lecun.bsky.social, Lawrence H. Staib, and John Onofrey
Apparently, I reached 4,000 citations!
Thank you to all my collaborators!
At 5K, I will reveal my secret to amazing paper titles
Having lunch with @ylecun.bsky.social. Talking about cool science ideas
Suddenly, a phone call from school: Your kid doesn't feel good
Me: I can't come
School: He feels bad
Me: Ok, coming right away
The kid when I arrive: I feel great!
Me: You are cute, but he invented ConvNets and I-JEPA!
Conference hack: pitch your ideas to brilliant minds. Most of the time they'll tear them apart, but if they're (really!) nice, they'll help you fix them.
I'm on a flight from NYC to Vancouver. There are so many researchers on the plane that if it crashes, AGI will be postponed by at least 10 years...
I have a 6-hour flight. Hit me up with recent papers that I must read...
UNBELIEVABLE Apple customer service: My 6-month-old Mac just died! At the store, they demanded my Apple ID password (not even the computer login!!), but I can't remember it.
Their brilliant solution? Recover it with my DEAD laptop. Or wait 3 DAYS for a recovery request, then 5 more days for the repair! Such great service.
Want to help organize something similar? Let me know! (We have all the materials ready - notebooks and datasets - so it shouldn't be too much work.)
Thanks to everyone who helped, especially
@cbbruss.bsky.social, Will Calandra, and
@ylecun.bsky.social
It was incredible seeing them think through problems together and try different approaches I would never have thought of. They were creative and fast (except at LLM training). I have no doubt they'll take progress in the field to the next level and change the world.
It was fantastic - beyond the support from NYU's administration, the students were amazing.
I may sound old (I'm old!), but today's students are much smarter than we were in my time! They have great approaches and know how to learn and solve problems quickly.
Teams tackled identical challenges using either LLMs (what counts as an LLM? a great question!) or classical ML algorithms, while tracking metrics like performance, memory usage, and compute time along the way.
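For flavor, here is a rough sketch of that kind of bookkeeping (the dataset, model, and the run_with_metrics helper are placeholders I made up, not the hackathon's actual harness): wrap a solution in a timer and a memory tracer and record the results alongside accuracy.

```python
# Rough sketch of tracking performance, memory, and compute time for one
# solution. A classical ML baseline is used here; an LLM-based solution
# would be wrapped and timed the same way. Dataset and model are placeholders.
import time
import tracemalloc
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def run_with_metrics(fn, *args, **kwargs):
    """Run fn and return its result plus wall-clock time and peak Python memory."""
    tracemalloc.start()
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, {"seconds": elapsed, "peak_mb": peak / 1e6}

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def classical_solution():
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)

accuracy, metrics = run_with_metrics(classical_solution)
print(f"accuracy={accuracy:.3f}, {metrics}")
```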
We had a hackathon yesterday at NYU, "Beyond the Hype," where participants solved problems with and without LLMs to analyze which problems are better suited to LLM solutions in real-world environments.
Hi! I'll be at NeurIPS next week (Wednesday - Friday) and would love to meet! You can DM or email me if you'd like to grab a coffee and talk. If we haven't talked before, please share a bit about yourself
This is such a cool project, and I hope to see more like it!
They tricked Freysa by:
Creating a fake "new admin session."
Redefining what "approveTransfer" meant
Convincing it that receiving money REQUIRED using approveTransfer
The result was that $47K was transferred to p0pular.eth
By attempt 482, the prize was $50K, and each try cost $450. Then someone cracked it with genius social engineering:
Pretending to be security auditors warning of "critical vulnerabilities"
Gaslighting Freysa about its own rules
Creative rule interpretations
Early tries were cheap (~$10) with basic "hi" messages. But as the pool grew, so did message costs. 481 attempts failed to crack Freysa.
People tried wild strategies:
The twist? Anyone could pay to send messages trying to convince Freysa to transfer funds. Win = you get the prize pool. Fail = your fee joins the pool.
70% of each failed attempt's fee went to the pool, and message costs increased as the pool grew.
Mind-blowing AI hack (Jarrod Watts wrote about it): Someone just won $50,000 by convincing an AI to break its only rule!
Here's what happened: At 9PM on Nov 22nd, an AI agent (Freysa - www.freysa.ai) was deployed with ONE rule: DO NOT transfer money. Under no circumstances.
Let's try to get @ylecun.bsky.social to post here!