Lintang Sutawika

@sutawika.com

PhD @ltiatcmu.bsky.social previously @eleutherai.bsky.social 🌐 lintang.sutawika.com

2,190
Followers
219
Following
13
Posts
13.07.2023
Joined

Latest posts by Lintang Sutawika @sutawika.com

Can you train a performant language model using only openly licensed text?

We are thrilled to announce the Common Pile v0.1, an 8TB dataset of openly licensed and public domain text. We train 7B models for 1T and 2T tokens and match the performance of similar models like LLaMA 1 & 2.

06.06.2025 19:18 👍 147 🔁 59 💬 2 📌 2

Damn, where are these parties I'm missing 😂

09.03.2025 02:01 👍 2 🔁 0 💬 0 📌 0

Technically, we do, but a lot of that goes to paying tuition. Not unlike the 20k for these agents going towards GPU compute 🤪

07.03.2025 03:58 👍 0 🔁 0 💬 1 📌 0

Maybe he thought it was “locker room talk” 🤪

17.02.2025 23:53 👍 0 🔁 0 💬 0 📌 0
Google “We Have No Moat, And Neither Does OpenAI” Leaked Internal Google Document Claims Open Source AI Will Outcompete Google and OpenAI The text below is a very recent leaked document, which was shared by an anonymous individual on a public Disc…

Feels like a great time to re-share this

semianalysis.com/2023/05/04/g...

28.01.2025 05:13 👍 9 🔁 0 💬 0 📌 0

They're future-proofing the design 😎

02.12.2024 05:58 👍 2 🔁 0 💬 1 📌 0

The `decision model n` is being directed by mission control and then forwards a signal to `big data`?? I guess no decision was ever made 😂😂😂

02.12.2024 05:23 👍 1 🔁 0 💬 1 📌 0

Maybe. But probably more likely, they're using QwQ or Deepseek.

02.12.2024 04:21 👍 4 🔁 0 💬 1 📌 0

Transformers demonstrated how to attend to an entire sequence at once, which at the time differed from many approaches like LSTMs that processed tokens sequentially. The attention span across the whole sequence does parallel the aliens from Arrival.

01.12.2024 16:13 👍 5 🔁 0 💬 0 📌 0

🙋‍♂️

24.11.2024 00:23 👍 1 🔁 0 💬 0 📌 0

Attended 2 different lectures (1 class and 1 invited guest lecture) with the similar topic of inference-time scaling. Maybe the matrix is trying to tell me something.

22.11.2024 02:14 👍 0 🔁 0 💬 0 📌 0

Lectures in #nlp I see that use Taylor Swift to illustrate concepts.

21.11.2024 20:44 👍 1 🔁 0 💬 0 📌 0

@eleutherai.bsky.social is our official account. Will be posting here and on Twitter from now on.

20.11.2024 14:18 👍 20 🔁 3 💬 2 📌 0

LTI PhDs seeking refuge in Bluesky
go.bsky.app/NhTwCVb

07.11.2024 16:46 👍 4 🔁 0 💬 0 📌 0

Hi, I would also like to be included in this list!

07.11.2024 16:45 👍 1 🔁 0 💬 0 📌 0