The "kombucha girl" meme: the left panel (her frowning) reads "According to the proposition" and the right panel (her interested face) reads "Lemma tell you".
Mathematical writing is my passion.
like this?
Added a new symbols menu - let me know if I missed any of your favourite LaTeX commands!
I can't tell how much interest there is. But messages like this definitely encourage me to continue it!
I needed an easy way to make high resolution equations to post on Bluesky, so I made this: thomasahle.com/latex2png
> If NATO hadn't been trying to expand there, there would have been no war.
There would.
> If NATO stops trying to expand into Ukraine, the war ends.
It wouldn't.
> If the US stops sending weapons and fomenting anti-Russian sentiment, the war ends.
This war is about territory not sentiment.
You can play around with expectations of higher order Gaussians using the new
tensorcookbook.com/playground
Isserlis' (or Wick's) theorem is one of the strongest tools to handle High Dimensional Gaussians.
Turns out it generalizes to _every distribution_ using cumulant tensors!
That's higher order variance, skewness, kurtosis, etc.
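For the zero-mean Gaussian case, Isserlis' theorem says that an even moment equals the sum, over all ways of pairing up the factors, of the products of the paired covariances (odd moments vanish). Below is a minimal sketch of that statement in plain Python. The helper names `pairings` and `isserlis_moment` are made up for illustration and are not taken from tensorgrad or the Tensor Cookbook, and the cumulant generalization is not covered.

```python
def pairings(items):
    """Yield every way to split a list into unordered pairs (by position)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def isserlis_moment(cov, indices):
    """E[x_i1 * x_i2 * ... * x_in] for a zero-mean Gaussian with covariance `cov`.

    `cov` is a nested list (symmetric matrix); `indices` may repeat,
    e.g. [0, 0, 1, 1] means E[x0^2 * x1^2].
    """
    if len(indices) % 2 == 1:
        return 0.0  # odd moments of a zero-mean Gaussian vanish
    total = 0.0
    for pairing in pairings(list(indices)):
        term = 1.0
        for a, b in pairing:
            term *= cov[a][b]
        total += term
    return total


if __name__ == "__main__":
    cov = [[2.0, 0.5], [0.5, 1.0]]
    # E[x0^2 x1^2] = cov00*cov11 + 2*cov01^2
    print(isserlis_moment(cov, [0, 0, 1, 1]))
```

The number of pairings grows as (n-1)!!, so this brute-force enumeration is only practical for small moments.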
I added a Playground to tensorcookbook.com for when you need that Matrix or Tensor Derivative in a hurry.
Hopefully it can also be a way to help people become familiar with tensor diagrams.
Now we're just waiting for a ZkiT model
Now live in a new Functions chapter in tensorcookbook.com
Some sketches for the next chapter
I added code execution to tensorcookbook.com so you can try tensorgrad's automatic tensor algebra without installing anything.
Congratulations to Rasmus Pagh @rasmuspagh.net, the inventor of Cuckoo Hashing, and my PhD advisor, for becoming an ACM Fellow!
di.ku.dk/english/news...
Tensor Product Attention illustrated with Tensor Diagrams
Neat one-page proof of "Stirling's bound"
(n/e)ⁿ √(2πn) ≤ n! ≤ (n/e)ⁿ (√(2πn) + 1)
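A quick numerical sanity check of the two-sided bound, as a sketch only: it checks small n in floating point and is not a substitute for the proof. The function names are chosen here for illustration.

```python
import math


def stirling_bounds(n):
    """Return (lower, upper) = ((n/e)^n * sqrt(2*pi*n), (n/e)^n * (sqrt(2*pi*n) + 1))."""
    base = (n / math.e) ** n
    root = math.sqrt(2 * math.pi * n)
    return base * root, base * (root + 1)


def bounds_hold(n):
    """Check lower <= n! <= upper using floating-point bounds."""
    lower, upper = stirling_bounds(n)
    return lower <= math.factorial(n) <= upper


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50):
        lower, upper = stirling_bounds(n)
        print(n, lower, math.factorial(n), upper)
```

The range is kept modest because `(n / math.e) ** n` overflows a double for n a little above 170.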
Inspired by the discussion on mathoverflow.net/a/458011/5429. Just had to keep hitting it with logarithmic inequalities...
Yes please!
Poisson Probability Puzzle:
Let X ~ Poisson(λ); Z = (X - λ)/√λ; Y ~ Normal(0, 1).
How close is E[|Z|^k] to E[|Y|^k]?
Say we tie λ to k by λ = c k³. What is the limit of E[|Z|^k]/E[|Y|^k] as k → ∞?
This was harder to solve than expected, but the answer was surprisingly pretty
"Central Limit Theorem" for the Poisson Distribution
A while ago Twitter removed the option of embedding your timeline on your website. Luckily, with Bluesky, I'm now able to put it back on thomasahle.com. Good to be back.
Can you refer me to the openai forum?
For more information on history heuristics in chess, see www.chessprogramming.org/History_Heur...
near future.
Time will tell if they'll update the entire network, or a smaller LoRA or side network.
Even chatbots like o1 could use TTT as an alternative to in context learning.
5/5
while searching. If two subtrees are conceptually similar, it has to do all the work twice.
Test Time Training fixes this!
If AlphaZero updated its weights while searching, it could transfer learnings between the subtrees!
I'm sure we'll start seeing a lot of TTT architectures in the near...
4/5
Obviously having a pretrained cnt[from][to] array wouldn't be helpful at all in chess, as moves may be good or bad entirely dependent on the position.
But because the butterfly table is reset at every search, it encodes "local information".
AlphaZero meanwhile doesn't learn anything while...
3/5
Chess engines like Stockfish will keep a so-called butterfly board, keeping track of how often a move was chosen in the search tree. _Independently of the position_.
This data is consulted elsewhere in the search tree to decide how much time to spend considering the move.
Why do this?
2/5
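A rough sketch of the butterfly-board idea from this thread, in Python: a `cnt[from][to]` table that ignores the position, is reset at the start of every search, and is then used to prioritize moves. This is an illustration under loud assumptions, not Stockfish's actual code: the class name, the flat +1 update, and the use of the counts for plain move ordering are all simplifications chosen here.

```python
class ButterflyTable:
    """Position-independent move counters indexed by (from_square, to_square)."""

    def __init__(self, squares=64):
        self.squares = squares
        self.reset()

    def reset(self):
        # Called at the start of every search, so the counts only encode
        # "local" information about the current search tree.
        self.cnt = [[0] * self.squares for _ in range(self.squares)]

    def record(self, from_sq, to_sq, weight=1):
        # Assumption: a flat increment each time the move is chosen.
        self.cnt[from_sq][to_sq] += weight

    def order_moves(self, moves):
        # Try the most frequently chosen moves first. sorted() is stable,
        # so moves with equal counts keep their original order.
        return sorted(moves, key=lambda m: self.cnt[m[0]][m[1]], reverse=True)


if __name__ == "__main__":
    table = ButterflyTable()
    table.record(12, 28)  # e2-e4 with squares numbered 0..63
    table.record(12, 28)
    table.record(6, 21)   # g1-f3
    print(table.order_moves([(6, 21), (1, 18), (12, 28)]))
```

Because the key is only (from, to), a move that worked in one subtree is tried early in sibling subtrees too, which is the cheap transfer between subtrees that the thread is pointing at.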
Test Time Training promises to finally unify learning and search. As always, chess is a good place to study such ideas:
AlphaZero generalized and simplified most of the tricks in chess engines like Stockfish, but one category is missing: history heuristics...
1/5
Your o1 supports images?
Making a wiki-style website is a good way to do this, while encouraging others from the community to contribute and keep it updated.
In fact, writing good Wikipedia articles for your field might be the best way to spread this knowledge.
Clever use of the KV-cache: Writing in the margins (arxiv.org/abs/2408.14906) at NeurIPS next week.
By "taking notes" as you read, you reduce the complexity from N^3 (N tokens at N^2 cost) to N^3/3 (1+4+9+...+N^2).
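The N^3 versus N^3/3 arithmetic can be checked directly. The sketch below only reproduces the cost model stated in the post (step i costs i^2, versus every step costing N^2); it says nothing about the paper's actual implementation, and the function names are illustrative.

```python
def full_cost(n):
    """N steps, each paying the full N^2 cost."""
    return n * n ** 2


def incremental_cost(n):
    """Step i only pays i^2, so the total is 1 + 4 + 9 + ... + n^2."""
    return sum(i * i for i in range(1, n + 1))


def sum_of_squares_closed_form(n):
    """Closed form n(n+1)(2n+1)/6, which is about n^3/3 for large n."""
    return n * (n + 1) * (2 * n + 1) // 6


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, incremental_cost(n) / full_cost(n))
```

The ratio approaches 1/3 from above as N grows, matching the N^3/3 figure.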