Get inspired, follow the fine-tuning guide, and build! x.com/ben_burtens...
- Model: huggingface.co/google/func...
- MLX quants by @Prince_Canuma: huggingface.co/collections...
- Amazing game by @xenovacom: huggingface.co/spaces/webm...
Google's FunctionGemma is out 🥳
A smol 270M-parameter (not B!) model. Why is this interesting?
🔨 Designed for tool calling.
📲 Perfect for on-device use.
📈 Dramatically improves performance on your domain with fine-tuning.
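Tool calling with a small model boils down to: the model emits a structured function call, and your code parses and dispatches it. Here is a minimal Python sketch of that loop; the tool name, the registry, and the JSON call format are illustrative assumptions, not FunctionGemma's actual output format.

```python
import json

# Hypothetical tool registry mapping tool names to callables.
def get_weather(city: str) -> str:
    # Stub implementation for illustration only.
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a JSON tool call like {"name": ..., "arguments": {...}} and run it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated model output; a real tool-calling model would emit this string.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)  # Sunny in Paris
```

Fine-tuning on your own tool schemas is what makes a 270M model reliable at emitting calls in exactly this kind of format.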
JetBrains has been quietly building something special for the open-source LLM community. More details will be posted soon on Hugging Face. Stay tuned! 🧑‍💻
Love it! Are you planning to transfer the sense of humor as well?
I used to disable Sleep in the settings while still allowing the display to turn off; now it looks like that combination is no longer possible :(
Announcing Global-MMLU - an improved MMLU Open dataset with evaluation coverage across 42 languages.
The result of months of work with the goal of advancing Multilingual LLM evaluation.
Built together with the community and amazing collaborators at Cohere4AI, MILA, MIT, and many more.
Impressed by this space! Feed it a pic, describe your dream setting, and transform scenes instantly.
Check out this lunar rover transforming into a cinematic moonscape with Earth hanging majestically in the sky! #AIart #DigitalArt
Try it out: huggingface.co/spaces/Yuans...
GIF of me scrolling through the LLMOps database website
🤔 Do you ever wonder how companies are putting LLMs and GenAI apps into production? What stacks do they use? What architecture did they go with?
I put together a database of known public technical writeups with summaries of the key technical features.
The amazing, new Qwen2.5-Coder 32B model can now write SQL for any @hf.co dataset ✨
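The core idea is that a dataset can be loaded into a SQL engine and the model, shown the table schema, writes the query for you. A toy stand-in using stdlib SQLite; the table, the data, and the "model-written" query string here are assumptions for illustration, not what the Space actually does internally.

```python
import sqlite3

# Toy stand-in for a Hub dataset loaded into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (text TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("great", 5), ("meh", 3), ("bad", 1)],
)

# The kind of query a code model might write from a natural-language request
# such as "what is the average star rating?"
model_written_sql = "SELECT AVG(stars) FROM reviews"
avg = conn.execute(model_written_sql).fetchone()[0]
print(avg)  # 3.0
```

Because SQL is validated by the engine, a wrong query fails loudly instead of silently producing garbage, which is part of why code models pair so well with this workflow.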
So many open-source and open releases last week!
Here's a recap, find the text-readable version here huggingface.co/posts/merve/...
Excited to see that PrimeIntellect/INTELLECT-1-Instruct is the first non-Ai2 model to train on parts of the Tulu 3 datasets/recipe. Took about 1 week.
https://buff.ly/3Zjmako
Congrats!
🎨 Love this new colorization tool! Upload your B&W photos, pick a model, and watch them transform into vibrant masterpieces. It even auto-generates captions! Perfect for bringing old memories to life in full color ✨
Kudos to @fffiloni.bsky.social
Try it out: huggingface.co/spaces/fffil...
I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:
1️⃣ Understanding tool calling with Llama 3.2 🔧
2️⃣ Using Text Generation Inference (TGI) with Llama models 📦
(links in the next post)
This is insane! Structured generation in the browser with the new @hf.co SmolLM2-1.7B model
• Tiny 1.7B LLM running at 88 tokens/second ⚡
• Powered by MLC/WebLLM on WebGPU 🔥
• JSON structured generation entirely in the browser 🤖
We just deployed Qwen/QwQ-32B-Preview on HuggingChat! It's Qwen's latest experimental reasoning model.
It's super interesting to see the reasoning steps, and with really impressive results too. Feel free to try it out here: huggingface.co/chat/models/...
I'd love to get your feedback on it!
Fuck it! Structured generation w/ SmolLM2 running in the browser & WebGPU 🔥
Powered by MLC Web-LLM & XGrammar ⚡
Define a JSON schema, input free text, get structured data right in your browser - profit!!
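The idea behind grammar-constrained decoding is that at every step the sampler may only pick tokens the grammar allows next, so the output is valid by construction. Real engines like XGrammar compile a JSON schema into a token-level automaton; this toy sketch hard-codes a tiny grammar for `{"name": "<word>"}` purely to illustrate the principle.

```python
import json

# Toy "grammar": for each decoding state, the tokens allowed next.
ALLOWED_NEXT = {
    "START": ['{"name": "'],
    "NAME": ["alice", "bob"],
    "WORD_DONE": ['"}'],
}

def constrained_generate(pick_word) -> str:
    """Generate under the toy grammar; the model only chooses where
    the grammar offers more than one option."""
    out = ALLOWED_NEXT["START"][0]        # opening tokens are forced
    out += pick_word(ALLOWED_NEXT["NAME"])  # model picks among allowed words
    out += ALLOWED_NEXT["WORD_DONE"][0]   # closing tokens are forced again
    return out

result = constrained_generate(lambda options: options[0])
print(result)  # {"name": "alice"}
assert json.loads(result) == {"name": "alice"}  # always parses as valid JSON
```

Whatever the "model" chooses, the result always parses, which is exactly the guarantee structured generation gives you in the browser demo.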
FYI, I muted this conversation. I am blocking some users and reporting others because, since yesterday, I have been receiving death threats and plenty of harassment, for something I didn't do. Wasn't Bluesky supposed to be different from other social networks?
I am no longer willing to engage in this conversation.
Thank you to the @neuripsconf.bsky.social for this recognition of the Generative Adversarial Nets paper published ten years ago with @ian-goodfellow.bsky.social, Jean Pouget-Abadie, @memimo.bsky.social, Bing Xu, David Warde-Farley, Sherjil Ozair and Aaron Courville.
blog.neurips.cc/2024/11/27/a...
A librarian who previously worked at the British Library created a relatively small dataset of bsky posts, hundreds of times smaller than previous researchers' datasets, to help folks create toxicity filters and stuff.
So people bullied him & posted death threats.
He took it down.
Nice one, folks.
TIL you can see which lists you belong to in bsky, and it seems I've been blocked by 150 people already due to my post yesterday 💪
I'll keep hoping for a collaborative and kind space where empathy rules rather than polarization and violence ❤️
clearsky.app/osanseviero....
I'm disheartened by how toxic and violent some responses were here.
There was a mistake, a quick follow-up to mitigate it, and an apology. I worked with Daniel for years, and he is one of the people most concerned with the ethical implications of AI. Some replies are Reddit-level toxic. We need empathy.
We're looking for an intern to join our SmolLM team! If you're excited about training LLMs and building high-quality datasets, we'd love to hear from you. 🤗
US: apply.workable.com/huggingface/...
EMEA: apply.workable.com/huggingface/...
The (non-exhaustive) evolution of base models
If you want to learn more about these models and how to use them, check out the freshly released book "Hands-On Generative AI", written with @pcuenq.hf.co, @apolinario.bsky.social, and Jonathan
www.oreilly.com/library/view...
OLMo 2 is out 🥳 7B and 13B models trained on 5T tokens, and meticulously instruction-tuned using the Tulu 3 recipe.
Simply the best fully open models yet.
Really proud of the work & the amazing team at
@ai2.bsky.social
More info:
⛰️ Andi's post bsky.app/profile/andi...
📝 Blog post huggingface.co/blog/smolvlm
🙌🏻 Models huggingface.co/collections/...
🎮 HF Demo huggingface.co/spaces/Huggi...
🔨 mlx-vlm PR [WIP] github.com/Blaizzy/mlx-...
SmolVLM was just released!
It's a great, small, fully open VLM that I'm really excited about for fine-tuning and on-device use cases 💻
It also comes with 0-day MLX support via mlx-vlm; here it is running at >80 tok/s on my M1 Max 🤯
Smol TTS keeps getting better! Introducing OuteTTS v0.2: 500M parameters, multilingual, with voice cloning! 🔥
> Multilingual: English, Chinese, Korean & Japanese
> Cross-platform inference w/ llama.cpp
> Trained on 5 billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained via HF GPU grants
A screenshot of LightEval benchmarking results in a terminal
Check out how easy it is to do LLM evals with LightEval!
* any dataset on the 🤗 Hub can become an eval task in a few lines of code: customize the prompt, metrics, parsing, few-shots, everything!
* model- and data-parallel inference
* auto batching with the new vLLM backend