More embedding models and an even more reliable inference engine is what you get with @hf.co Text Embeddings Inference v1.9.0 🔥
More in the thread 🧵
Kernels now has an agent skill to write custom Hub kernels: huggingface.co/blog/custom-...
Awesome work by @benburtenshaw.bsky.social and Sayak Paul! 🔥
And degoogle your phone.
kernels 0.12 is out! 🚀
Changes:
* Support for kernel version branches to gracefully roll out kernel API changes.
* Support for PyTorch 2.10.
* kernel-builder is now merged into the kernels repo.
* Initial support for standardized kernel benchmarks.
github.com/huggingface/...
Zed has been great for me, is very fast, and has a single 'turn all AI off' toggle.
DeepSeek R1 dropped one year ago 😳 and a lot has changed.
With Irene Solaiman, we're launching a blog series on
@hf.co about how that moment reshaped AI + open source in 2025, starting with strategic shifts and the explosion of new open models in China!
huggingface.co/blog/hugging...
🔥 I am super excited for the official release of an open-source library we've been working on for about a year!
💪 interpreto is an interpretability toolbox for HF language models 🤗. In both generation and classification!
Why do you need it, and for what?
1/8 (links at the end)
T-Head; it uses a fork of the 0.7 draft of the RISC-V Vector extension.
🎄 Look what 🎅 has brought just before Christmas 🎁: a brand new Research Master in Natural Language Processing at @facultyofartsug.bsky.social @rug.nl
Program: www.rug.nl/masters/natu...
Applications (2026/2027) are open! Come and study with us (you will also learn why we have a 🐮 in our logo)
We are currently doing a reading group on RISC-V and its vector extension. I actually got to implement it using the fast inverse square root because the T-Head board that we use does not have the vfrsqrt7.v instruction. So, full-circle I guess.
github.com/danieldk/low...
It started out as a joke with @kadarakos.bsky.social in 2022 when we worked at @explosion.ai that we should make an activation function using the fast inverse sqrt of Kahan/Walsh and famously used in Quake 3.
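For readers who don't know the trick: the Kahan/Walsh fast inverse square root approximates 1/√x with a bit-level initial guess followed by a single Newton-Raphson step. A minimal Python sketch of the classic Quake 3 version (Python floats are doubles, so we round-trip through 32-bit representations to mimic the original):

```python
import struct

def fast_inv_sqrt(x: float) -> float:
    """Approximate 1/sqrt(x) with the classic Quake 3 bit trick."""
    # Reinterpret the 32-bit float's bits as an unsigned integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The magic constant yields a good initial guess for 1/sqrt(x).
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer back as a float.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration refines the estimate.
    return y * (1.5 - 0.5 * x * y * y)
```

After one refinement step the relative error stays under about 0.2%, which is why it works as a cheap stand-in for a missing `vfrsqrt7.v` instruction.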
Benchmarks comparing RISC-V vectorized activation functions on a Milk-V Duo 256M. Dish is the fastest with 110M elements per second, followed by Swish with 57M elements per second and the slowest is the Cook GELU approximation coming in at 39M elements per second.
I finally made a page on my Dish activation function, replacing my deleted Tweet: danieldk.eu/Dish-Activat...
It's a non-monotonic function similar to GELU/SiLU, but does not require elementary functions, making it faster on various hardware.
I'll leave the empirical evaluation to someone else 🙂.
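The exact Dish definition is on the linked page; as an illustrative sketch of the idea only (my assumption of a SiLU-like gate built from an inverse square root, not the published formula), a non-monotonic GELU/SiLU-style activation can be written as:

```python
import math

def dish_like(x: float) -> float:
    """Illustrative GELU/SiLU-style activation (NOT the published Dish formula).

    The gate x / sqrt(1 + x^2) plays the role that sigmoid plays in SiLU,
    but needs only an inverse square root, which hardware can approximate
    cheaply (e.g. with a fast-inverse-sqrt trick), no exp/erf required.
    """
    return 0.5 * x * (1.0 + x / math.sqrt(1.0 + x * x))
```

Like SiLU, this is zero at the origin, near-identity for large positive inputs, and dips below zero for moderate negative inputs before returning toward zero, so it is non-monotonic.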
Training LLMs end to end is hard. But way more people should, and will, be doing it in the future.
The @hf.co Research team is excited to share their new e-book that covers the full pipeline:
· pre-training,
· post-training,
· infra.
200+ pages of what worked and what didn't. ⤵️
Graph showing the conversion of Hugging Face repositories from LFS storage to Xet storage.
The Hub is 100% on Xet. 🎉
A little over a year ago, @hf.co acquired XetHub to unlock the next phase of growth in models and datasets. huggingface.co/blog/xethub-...
In April, there were 1,000 Hugging Face repos on Xet. Now every repo (over 6M) on the Hub is on Xet.
We made a blog post on how you can use kernel-builder to develop and build compute kernels for the @hf.co Kernel Hub:
huggingface.co/blog/kernel-...
Also a huge shout-out to @nixos-org.bsky.social! All the kernels in huggingface.co/kernels-comm... are built using kernel-builder, which uses Nix under the hood to build ABI3 kernels for all the supported Torch configurations (various CUDA/ROCm versions, Metal):
github.com/huggingface/...
Yesterday we released support for GPT OSS (the new OpenAI open weight model) across the @hf.co ecosystem. The latest Transformers now integrates support for the kernels package and uses kernels from the HF Kernel Hub to run models like GPT OSS as fast as possible. 🚀
huggingface.co/blog/welcome...
David Holz made an introduction video showing how to make your own kernels with kernel-builder:
www.youtube.com/watch?v=HS5P...
The kernel ecosystem is completely open: you can build your own kernels with kernel-builder, upload them to the Hub, and register a mapping with the kernels package so they get used by Transformers.
github.com/huggingface/...
github.com/huggingface/...
Transformers 4.54.0 is out! This release adds support for compute kernels hosted on the Hub. When enabled, Transformers can replace PyTorch layer implementations with fast, specialized kernels from the Hub.
github.com/huggingface/...
Just released a new version of mktestdocs. It now also supports huggingface docstrings!
github.com/koaning/mkt...
Some of the ModernBERT team is back with new encoder models: Ettin, ranging in size from tiny to large: 17M, 32M, 68M, 150M, 400M & 1B parameters. They also trained decoder models & checked if decoders could classify & if encoders could generate.
Details in ๐งต:
So excited to finally release our first robot today: Reachy Mini
A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community.
Read more and order now at huggingface.co/blog/reachy-...
SUSE has released Cavil-Qwen3-4B, a fine-tuned #opensource #LLM on #HuggingFace. Built to detect #legal text like license declarations, it empowers #devs to stay #compliant, fast and efficiently. #openSUSE #AI #Licenses news.opensuse.org/2025/06/24/s...
Over the past few months, we have worked on the @hf.co Kernel Hub. Kernel Hub allows you to get cutting-edge compute kernels directly from the hub in a few lines of code.
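As a sketch of what "a few lines of code" means here (assumptions: a CUDA machine, `pip install kernels`, and the `kernels-community/activation` repo; the `gelu_fast` op name is taken from the kernel repos I know of and may differ):

```python
# Sketch only: requires `pip install kernels`, PyTorch, and a CUDA GPU.
import torch
from kernels import get_kernel

# Download (and cache) a pre-compiled kernel straight from the Hub.
activation = get_kernel("kernels-community/activation")

x = torch.randn(32, 1024, device="cuda", dtype=torch.float16)
out = torch.empty_like(x)
# Call one of the ops the kernel repo exposes (assumed name here).
activation.gelu_fast(out, x)
```

No local CUDA toolchain or compilation step is needed; the matching pre-built binary for your Torch/CUDA configuration is fetched from the Hub.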
David Holz made a great writeup of how you can use kernels in your projects: huggingface.co/blog/hello-h...
Hi Berlin people! @hugobowne.bsky.social is in town & we're celebrating by hosting a meetup together ๐ This one is all about building with AI & we'll also open the floor for lightning talks. If you're around, come hang out with us!
📅 June 16, 18:00
📍 Native Instruments (Kreuzberg)
🎟️ lu.ma/d53y9p2u
TGI v3.3.1 is released! This version switches to Torch 2.7 and CUDA 12.8. This should improve support for GPUs with compute capabilities 10.0 (B200) and 12.0 (RTX50x0 and NVIDIA RTX PRO Blackwell GPUs).
github.com/huggingface/...
@aob.nl nice timeline of the strikes in the education magazine, but the 18 March strike at @rug.nl was left out, a bit of a shame!
We just released text-generation-inference 3.3.0. This release adds prefill chunking for VLMs 🎉. We also made Gemma 3 faster and reduced its VRAM usage by switching to flashinfer for prefills with images.
github.com/huggingface/...