Fun sonnet 4 hallucination on muP
The Yang-Lecun correspondence
30.05.2025 07:59
Very happy to share the command-A tech report! I believe this is the largest published model with muP+fp8 :)
Lots of interesting post-training details as well. And great performance ofc!
arxiv.org/abs/2504.00698
> spend some time porting critical code from python to c++
> c++ code is slower than python
> After a while optimizing it, figure out you forgot to add -O3
> Runs much faster obviously
> At the end the python bindings eat up half of the runtime benefits
🥲😢