Nice write up comparing Numba, C++, and Mojo going from scalar, to SIMD, to GPU implementations. Written by the stringzilla author: ashvardanian.com/posts/scalin...
My handsome husband @bradlarson.bsky.social presented custom GPU kernels written in Mojo at their community meeting earlier this week:
youtu.be/XYzp5rzlXqM?...
But under the hood, we've built a generalized framework for programming accelerators, from a computational graph API in Python down to our multi-device kernels written in Mojo. It's worth noting that we use no CUDA libraries, yet we're hitting state-of-the-art performance on NVIDIA GPUs. AMD support is coming soon.
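To make the "computational graph API" idea concrete, here's a toy sketch of the pattern: a Python front end records operations into a graph lazily, and a separate pass executes it later. This is purely a hypothetical illustration of the concept, not Modular's actual MAX API (in the real system, execution dispatches to Mojo kernels on CPUs and GPUs rather than plain Python).

```python
# Toy computational-graph sketch (hypothetical; NOT the MAX API).
# Building an expression only records nodes; evaluate() runs the graph.

class Node:
    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const", "add", or "mul"
        self.inputs = inputs  # upstream Node objects
        self.value = value    # payload for constants

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))

def const(v):
    return Node("const", value=v)

def evaluate(node):
    # Post-order walk: a stand-in for dispatching each op to a device kernel.
    if node.op == "const":
        return node.value
    args = [evaluate(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError(f"unknown op: {node.op}")

# Build the graph lazily, then execute it in one pass.
x, y = const(3.0), const(4.0)
graph = x * y + const(1.0)
print(evaluate(graph))  # 13.0
```

The payoff of this split is that the graph can be optimized and compiled as a whole before anything runs, which is what makes targeting different accelerators from one Python front end possible.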
We chose end-to-end serving of a large language model on NVIDIA A100 GPUs as our "steel thread" use case to prove out the core technology; it was pretty much the highest bar we could set for GPU performance: www.modular.com/blog/max-gpu...
I joined Modular at the beginning of this year because I was so excited about the company's vision for advancing heterogeneous computing. Today, we're launching our support for running ML models and more on GPUs: www.modular.com/blog/introdu...
There's been darkness this year (as recently as yesterday here in Madison), but I'm grateful for the opportunities I've had this year to work with great people, and for all the fun we've had as a family. It's been a joy to see many old friends on here.