
scoff manifesto

@andimgladofit

parody. non-actionable. all posts are financial advice

60
Followers
20
Following
124
Posts
10.11.2024
Joined

Latest posts by scoff manifesto @andimgladofit

you can check this yourself on gpt-oss 20b with high reasoning vs low reasoning in the prompt, plus any entropy reduction method. or if you have access to an 80 gig card, ablate the 120b.

28.11.2025 06:10 ๐Ÿ‘ 3 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
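the "entropy reduction" knob in the two posts above is, in its simplest form, just sharpening the sampling distribution. a minimal sketch using temperature scaling on toy logits (one common entropy-reduction method; nothing here is gpt-oss-specific):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    # Shannon entropy in nats, skipping zero-probability tokens
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

# toy next-token logits over a tiny vocabulary
logits = np.array([2.0, 1.5, 1.0, 0.2, -1.0])

# temperature < 1 concentrates mass on the top tokens,
# i.e. it reduces the entropy of the sampling distribution
h_base = entropy(softmax(logits, temperature=1.0))
h_cold = entropy(softmax(logits, temperature=0.5))
print(h_base, h_cold)  # h_cold < h_base
```

top-k / top-p truncation reduces entropy the same way: probability mass gets concentrated on fewer tokens before sampling.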

now that i have sold out and started working on these: it is because the big labs figured out local entropy reduction techniques are very effective, and they aggressively tune that knob

28.11.2025 06:07 ๐Ÿ‘ 3 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
WeatherNext 2: Our most advanced weather forecasting model The new AI model delivers more efficient, more accurate and higher-resolution global weather predictions.

completely unrelatedly, i am now ~fully convinced that there isn't a single real world smooth mapping that you can't capture by diffusing the correct amount in the correct space

blog.google/technology/g...

17.11.2025 22:43 ๐Ÿ‘ 4 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

at the same time, different channels will have different overall power spectra (that a full rank representation preserves) and so good latents must be doing some sort of spatial mixing directly, and the diffusion models must untangle that and step *down* in dimensionality while increasing dof

27.08.2025 01:06 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

because any information noised in the forward process cannot be seen later, these models always encode a hierarchical series of representations. but latent space is much closer to full rank than the target data manifold (a perfect one would be exactly full rank)

27.08.2025 01:02 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
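the "closer to full rank" claim can be eyeballed with a toy effective-rank (participation ratio) measure. everything below is synthetic stand-in data, not a real VAE latent: whitened noise plays the near-full-rank latent, and low-pass-filtered noise plays the smooth data manifold.

```python
import numpy as np

rng = np.random.default_rng(0)

def effective_rank(X):
    # participation ratio of singular values: (sum s^2)^2 / sum s^4
    s = np.linalg.svd(X, compute_uv=False)
    p = s**2
    return float(p.sum() ** 2 / (p**2).sum())

n, d = 2000, 64
white = rng.normal(size=(n, d))  # stand-in for a near-full-rank latent

# stand-in for "natural" data: smooth each row with a moving average,
# concentrating power in a few low-frequency directions
kernel = np.ones(9) / 9
smooth = np.apply_along_axis(
    lambda r: np.convolve(r, kernel, mode="same"), 1, white
)

print(effective_rank(white), effective_rank(smooth))
# the smoothed ("data-like") matrix has a much lower effective rank
```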

there is. in the continuous limit the models learn the score of the target conditional distribution. but the forward process is a gaussian perturbation kernel, so the step between any two diffusion times is white noise, and high frequency modes must drop (exponentially) faster

27.08.2025 00:58 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
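in symbols (my notation, a standard VP-style forward process; a sketch, not taken from the thread):

```latex
% perturbation kernel between diffusion times s < t:
\[
  q(x_t \mid x_s) \;=\; \mathcal{N}\!\big(\alpha_{t|s}\, x_s,\; \sigma_{t|s}^2 I\big)
\]
% the added noise is white, so in a frequency basis every mode k
% receives the same noise power while carrying signal power S(k):
\[
  \mathrm{SNR}_t(k) \;=\; \frac{\alpha_t^2\, S(k)}{\sigma_t^2}
\]
% natural signals have decaying spectra (e.g. $S(k) \propto k^{-2}$),
% so high-frequency modes sink below the noise floor first. coarse
% structure survives longest in the forward process; read in reverse,
% that is the coarse-to-fine hierarchy of representations.
```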
Post image

lmao that these might print

05.06.2025 19:46 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

this is easily one of the top 3 worst trades in nba history. david stern is furiously fighting his way out of hell to stop this

02.02.2025 11:55 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

shoutout to deepseek, showing you can just bolt on CoT with direct rl if your base model is good enough

27.01.2025 03:56 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

california basically needs to remove its entire regulatory state at this point or the people are going to elect democrat hitler

09.01.2025 18:59 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

hard to say with insurance, it's very regulated and i'm not an insurance guy. the broad issues are fraud and states making it unprofitable to service. definitely more parametric structures in the policies, but idk if those are even legal to offer to consumers (mb bypasses california's idiot laws?)

09.01.2025 18:46 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

there are already hurricane binary options, and some otc parametric structures (so like wind pressure, rainfall). i'm sure somebody has something similar for fire, but that is still otc. the big volume rn is temperature based contracts (LNG hedge)

09.01.2025 18:41 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

unrelated: you see this paper openreview.net/pdf?id=gojL6...

if this works on fluid dynamics im gonna lose my shit

08.01.2025 07:18 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

resolving to reply to more posts with lol in 2025

08.01.2025 06:32 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

this person is going to the reeducation camps when i take control

31.12.2024 02:30 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

i thought about doing something like this for fusion operations with triton or MLIR, but i think that's actually just a full phd topic of work because i'd need to develop some sort of proof engine for it

29.12.2024 05:02 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

some lab needs to give me 50000 h200s so i can implement an implicit runge kutta token sampler that costs 3.5 million dollars per inference run and outputs "i don't feel like doing that right now" 50% of the time

28.12.2024 18:01 ๐Ÿ‘ 8 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

if your children don't venerate Urkel thought they're ngmi

28.12.2024 17:55 ๐Ÿ‘ 12 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

yeah it's basically greenfield and it's the sort of problem where throwing money into a furnace gets you better solutions for a while

28.12.2024 17:52 ๐Ÿ‘ 6 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

not going to speak for him, but at least in terms of "make my llm bigger and deeper" it's unlikely that going from 600B to 6T parameters with autoregressive LLMs gets you even a 20% better model

there's a lot of room in inference compute though. imo mindless scaling isn't dead for at least 5 years

28.12.2024 09:20 ๐Ÿ‘ 12 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

it's also a pretty good book! not as a reference manual, but a good introduction

27.12.2024 20:50 ๐Ÿ‘ 5 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0
Calculus of Variations and Optimal Control Theory: A Concise Introduction by Daniel Liberzon

i will stop flaming you when you read this book

27.12.2024 20:02 ๐Ÿ‘ 13 ๐Ÿ” 0 ๐Ÿ’ฌ 3 ๐Ÿ“Œ 0
Post image

lol

27.12.2024 02:53 ๐Ÿ‘ 118 ๐Ÿ” 9 ๐Ÿ’ฌ 5 ๐Ÿ“Œ 1

genuinely shocked you didn't have it already lmao. my family loves making fun of me because i can pack all my material possessions into 4 boxes and move within 4 hours, and i still had one of their dutch ovens

25.12.2024 21:08 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

o3 arc just being optimal control style value iteration over token trajectories is a really funny way to blow up the agi foom cranks though

25.12.2024 01:01 ๐Ÿ‘ 4 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
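"value iteration over token trajectories" as a toy, for flavor: tabular value iteration on a tiny deterministic "token MDP" where states are partial sequences and the only reward is emitting the right final token. purely illustrative, nothing to do with whatever o3 actually does (which isn't public):

```python
import numpy as np

# states 0..3 index the length of the partial token sequence;
# any action (token) advances the state by one; state 3 is terminal
n_states, n_tokens, horizon = 4, 3, 3
gamma = 1.0

def reward(s, a):
    # +1 only for emitting token 2 from the penultimate state
    return 1.0 if (s == n_states - 2 and a == 2) else 0.0

V = np.zeros(n_states)
for _ in range(horizon):
    V_new = np.zeros(n_states)
    for s in range(n_states - 1):  # terminal state keeps value 0
        V_new[s] = max(reward(s, a) + gamma * V[s + 1] for a in range(n_tokens))
    V = V_new

print(V)  # -> [1. 1. 1. 0.]: the terminal reward has backed up to the start
```

the backup from the rewarded final token to the first state is exactly the "credit assignment over the trajectory" that process-style search/RL over reasoning tokens is after.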

the only mistake i made by being stochastic control theory brained was not leaning into it 50x harder

25.12.2024 00:58 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

i know i have insane bay area brain when i'm looking at a 30y 800k 7% mortgage and thinking 'huh that's reasonable'

23.12.2024 00:06 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

they can't really even install cuda, they just pip install torch and huggingface and pray the installation doesn't detonate

22.12.2024 23:57 ๐Ÿ‘ 1 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

not mine im built different

22.12.2024 23:55 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

anthropic does more and longer multi-round in both training and rlhf

22.12.2024 23:54 ๐Ÿ‘ 2 ๐Ÿ” 0 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0