Jonathan Frankle

@jfrankle.com

Chief AI Scientist at Databricks. Founding team at MosaicML. MIT/Princeton alum. Lottery ticket enthusiast. Working on data intelligence.

2,873
Followers
183
Following
33
Posts
26.05.2023
Joined

Latest posts by Jonathan Frankle @jfrankle.com

This is how it's done.

A strong and principled response by WilmerHale to the illegal Executive Order attack - a form of attempted government intimidation declared unconstitutional by a federal judge.

This is how to guard the rule of law.

28.03.2025 00:59 👍 26335 🔁 6665 💬 441 📌 375
TAO: Using test-time compute to train efficient LLMs without labeled data
LIFT fine-tunes LLMs without labels using reinforcement learning, boosting performance on enterprise tasks.

The hardest part about finetuning is that people don't have labeled data. Today, @databricks.bsky.social introduced TAO, a new finetuning method that only needs inputs, no labels necessary. Best of all, it actually beats supervised finetuning on labeled data. www.databricks.com/blog/tao-usi...

25.03.2025 17:19 👍 35 🔁 6 💬 0 📌 2
A poster with three professors' pictures and the text "can your professors handle the heat"

Join @kumarde.bsky.social, Bryan, and me in CSE tomorrow as we do Hot Ones for Academics. My normally spicy research takes will get even spicier.

27.02.2025 01:38 👍 8 🔁 2 💬 1 📌 0
Excited to share our work with friends from MIT/Google on Learned Asynchronous Decoding! LLM responses often contain chunks of tokens that are semantically independent. What if we can train LLMs to identify such chunks and decode them in parallel, thereby speeding up inference? 1/N

27.02.2025 00:38 👍 16 🔁 9 💬 1 📌 1
Improving Retrieval and RAG with Embedding Model Finetuning
Fine-tune embedding models on Databricks to enhance retrieval and RAG accuracy with synthetic data, no manual labeling required.

We're probably a little too obsessed with zero-shot retrieval. If you have documents (you do), then you can generate synthetic data and finetune your embedding model. Blog post led by @jacobianneuro.bsky.social shows how well this works in practice.

www.databricks.com/blog/improvi...

26.02.2025 00:48 👍 9 🔁 5 💬 1 📌 0

In case it is not clear from my reposts, the Trump administration is engaged in an illegal AND unconstitutional attempt to seize power over the federal government away from Congress and the courts. "Pausing" payment on the government's bills is just one part of it, but it is among the worst.

28.01.2025 16:55 👍 24 🔁 7 💬 1 📌 0

Being right for the wrong reasons doesn't increase my confidence...

27.01.2025 21:48 👍 6 🔁 0 💬 1 📌 0

All the more convinced that the markets don't understand AI. Both the irrational hype and the irrational pessimism. DeepSeek is incredibly bullish for GPU sales...

27.01.2025 21:20 👍 10 🔁 1 💬 2 📌 0

Thank goodness for Greek letters!

22.01.2025 17:22 👍 0 🔁 0 💬 0 📌 0
Meta backs Databricks as the data analytics startup inches toward IPO
Meta rarely invests in startups, but it works with Databricks on the Llama open-source models that Meta trains.

Very excited that our Series J is complete. Especially thrilled to have our friends at Meta on board!

22.01.2025 16:36 👍 12 🔁 1 💬 1 📌 0

Gives a new meaning to "Infrastructure Week"

22.01.2025 02:51 👍 4 🔁 0 💬 1 📌 0
Tweet from the Anti-Defamation League, or ADL:

This is a delicate moment. It's a new day and yet so many are on edge.
Our politics are inflamed, and social media only adds to the anxiety.
It seems that @elonmusk made an awkward gesture in a moment of enthusiasm, not a Nazi salute, but again, we appreciate that people are on edge.
In this moment, all sides should give one another a bit of grace, perhaps even the benefit of the doubt, and take a breath. This is a new beginning.
Let's hope for healing and work toward unity in the months and years ahead.
5:52 PM • Jan 20, 2025

This is so bad

20.01.2025 23:08 👍 8223 🔁 908 💬 1012 📌 694

I wasn't expecting a Nazi salute on day 1 but here we are. I of course understand that due to the palm on heart there's plausible deniability, but we all understand the intent.

20.01.2025 21:44 👍 43 🔁 2 💬 3 📌 0

Impressed by those able to talk about DeepSeek right now.

20.01.2025 21:32 👍 40 🔁 2 💬 2 📌 0
GitHub - databricks/Compose-RL
Contribute to databricks/Compose-RL development by creating an account on GitHub.

Interesting Friday evening code drop from @rajammanabrolu.bsky.social and Brandon Cui at @databricks.bsky.social. That's all I'm allowed to say for now... github.com/databricks/c...

18.01.2025 02:47 👍 13 🔁 1 💬 0 📌 1
Congestion Pricing Program in New York - MTA

Congestion Relief Zone tolling is now in effect.

Learn more: congestionreliefzone.mta.info

05.01.2025 05:01 👍 1034 🔁 199 💬 24 📌 108

Absolutely loving the RM Twitter/Bluesky discourse.

01.01.2025 03:33 👍 3 🔁 0 💬 0 📌 0
🧵 Super proud to finally share this work I led last quarter - the @databricks.bsky.social Domain Intelligence Benchmark Suite (DIBS)! TL;DR: Academic benchmarks ≠ real performance and domain intelligence > general capabilities for enterprise tasks. 1/3

19.12.2024 16:25 👍 5 🔁 4 💬 4 📌 1
Price of the brick going up

Databricks raises $10b Series J at $62b valuation, the largest venture round ever.
www.databricks.com/company/news...

17.12.2024 15:29 👍 11 🔁 2 💬 1 📌 1
Databricks is Raising $10B Series J Investment at $62B Valuation
Funding led by new investor Thrive Capital. Company expects to cross $3B in revenue run rate and achieve positive free cash flow in fourth quarter.

The world needs data intelligence, and @databricks.bsky.social is delivering. Thank you to the investors who continue to support us on this journey. 🧱🧱🧱 www.databricks.com/company/news...

17.12.2024 16:15 👍 14 🔁 3 💬 1 📌 0

Lastly, thank you as always to the amazing team at @databricks.bsky.social and the scientific and open source communities. You all keep me especially excited about the bright future we're creating. The folks at Meta, AI2, Eleuther, HuggingFace, Kaggle, among many many others.

13.12.2024 17:31 👍 3 🔁 0 💬 0 📌 0

TLDR: See the TLDR at the top of the thread. Merry NeurIPS to everyone in the AI community. 2025 will be an exciting year ♥️🧱📈

13.12.2024 17:31 👍 2 🔁 0 💬 1 📌 0

11. My understanding of semiconductor progress is that, even if it looks nice on a log/log plot, progress was never certain and hard-fought new ideas were always needed to get to the next step. If we knew how to get straight to 2nm, we wouldn't have done 65nm or 4nm.

13.12.2024 17:31 👍 3 🔁 0 💬 1 📌 0

10. Next metaphor: Moore's "law." Gordon Moore wrote a great article in 2003 reflecting on that trend called "No Exponential is Forever...But Forever Can Be Delayed!" cseweb.ucsd.edu/classes/wi10...

13.12.2024 17:31 👍 2 🔁 0 💬 1 📌 0

9. Even the worst case scenario for progress (model quality freezes forever at the level of GPT4++ and cost to use it keeps coming down) will lead to decades of new ideas and advances on top of AI that will leave the world transformed. Even the most bearish case is bullish.

13.12.2024 17:31 👍 6 🔁 0 💬 1 📌 1

8. But the world was transformed by the internet. It took decades of trial and error and experience and evolution and culture to make the most of it. Lower cost and greater access meant things that didn't make sense before (video sharing) later did. Imagine explaining "demure" to someone in 1995.

13.12.2024 17:31 👍 3 🔁 0 💬 1 📌 0

7. I look to the history of computing. From what I understand, the technology behind the internet was largely fixed ("ossified" according to some) by the mid 90s. All that changed between then and now is that cost came down and access improved.

13.12.2024 17:31 👍 7 🔁 0 💬 1 📌 0

6. Or maybe an incremental gain is enough to unlock extraordinary economic value that far outstrips the overall cost of building and using the model.

13.12.2024 17:31 👍 3 🔁 0 💬 1 📌 0

5. Maybe scaling trends continue but incremental bumps on real tasks mean the models aren't worth deploying given inference costs. I don't have inside info, but my guess is this is why we don't have Gemini 1.5 Ultra or Claude 3.5 Opus. Of course someone tried to train them.

13.12.2024 17:31 👍 4 🔁 0 💬 1 📌 1