Very nice touch, Gmail π
Part 2 of my journey building a smart home! π
In this part:
> ESPHome & custom component
> RF433 receiver & transmitter
> Hassio custom addon
Just published a new article on my blog πββοΈ
Building My Smart Home - Part 1: Plan, Idea & Home Assistant
Check it out!
Kudos to Google and the llama.cpp team! π€
GGUF support for Gemma 270M right from day-0
Watch it here: www.youtube.com/watch?v=Qtzz...
Richy Mini and SmolLM3 are featured in GitHub's weekly news! π π
Gemma 3n has arrived in llama.cpp π¨βπ³ π°
Comes in 2 flavors: E2B and E4B (E means "effective/active parameters")
See you this Sunday at AI Plumbers conference: 2nd edition!
π Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
π Register here: lu.ma/vqx423ct
β¨β¨ AIFoundry is bringing you the AI Plumbers Conference: 2nd edition β an open source meetup for low-level AI builders to dive deep into "the plumbing" of modern AI
π Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
π When: June 15, 2025
π Register now: lu.ma/vqx423ct
Hugging Face Inference Endpoints now officially support deploying **vision** models via llama.cpp π π
Try it now: endpoints.huggingface.co/catalog
Real-time webcam demo with @huggingface.bsky.social SmolVLM and llama.cpp server.
All running locally on a MacBook M3
We have A100, H200, M3 Ultra, etc.
Still can't match the power of that Casio FX π
llama.cpp vision support just got much better! π
Traditionally, models with complicated chat templates like MiniCPM-V or Gemma 3 required a dedicated binary to run.
Now, you can run all supported models with a single binary, "llama-mtmd-cli" π₯
(Only Qwen2VL is not yet supported)
Learn more: blog.ngxson.com/introducing-...
Finally have time to write a blog post about ggml-easy! π
ggml-easy is a header-only wrapper for GGML that simplifies development with a cleaner API, easy debugging utilities, and native safetensors loading β¨ Great for rapid prototyping!
Someone at Google definitely had a lot of fun making this π
And if you don't know, it's available in "Starter apps" section on AI Studio. The app is called "Gemini 95"
Telling LLM memory requirement WITHOUT a calculator?
Just use your good old human brain π§ π
Check out my 3-step estimation π
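For anyone who wants the arithmetic spelled out, here is a rough back-of-envelope sketch of the idea (my own simplification of the usual method, not necessarily the exact 3 steps in the graphic above): weights take roughly params × bits-per-weight ÷ 8 bytes, then you pad for KV cache and runtime buffers.

```python
def estimate_memory_gb(n_params_b: float,
                       bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    """Back-of-envelope VRAM/RAM estimate for running an LLM.

    n_params_b      -- model size in billions of parameters
    bits_per_weight -- ~16 for F16, ~8 for Q8, ~4.5 for typical Q4 quants
    overhead        -- fudge factor (~20%) for KV cache, activations, buffers
    """
    # Step 1: bytes for the weights themselves
    weights_gb = n_params_b * bits_per_weight / 8
    # Steps 2-3: add headroom for KV cache and runtime buffers
    return weights_gb * overhead
```

Example: a 7B model at ~4.5 bits/weight lands around 4 GB of weights, so with overhead you should budget roughly 5 GB. You can do that in your head: 7 × 4.5 ≈ 32, divide by 8, add ~20%.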
Google having a quite good sense of humor π
Joke aside, a 1B model quantized to Q4 without performance degradation is sweet π€
Cooking a fun thing today: I can now load safetensors files directly into GGML without having to convert them to GGUF!
Why? Because this allows me to run experiments faster, especially with models outside of llama.cpp π
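Part of what makes this feasible is how simple the safetensors container is: an 8-byte little-endian header length, then a JSON header mapping tensor names to dtype/shape/byte-offsets, then the raw tensor data. Here's a minimal Python sketch of reading that header (for illustration only, this is not the actual GGML-side loader, which is C/C++):

```python
import json
import struct

def read_safetensors_header(path: str) -> dict:
    """Parse the JSON header of a .safetensors file.

    On-disk layout: an 8-byte little-endian u64 giving the header size,
    then that many bytes of JSON mapping
    tensor name -> {"dtype", "shape", "data_offsets"},
    followed by the raw tensor bytes.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # drop the optional metadata entry
    return header
```

Once you have the header, each tensor's bytes sit at `8 + header_len + data_offsets[0]`, so the file can be memory-mapped and the weights fed straight into ggml tensors, no GGUF conversion step needed.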
No vibe coding. Just code it β
Visit my website --> ngxson.com
π
The Live Webinar will happen at
π 11 AM SF β 2 PM NYC β 6 PM London β 7 PM Paris
πππ Register here: app.getcontrast.io/register/sot... πππ
On Monday, the 24th, I'm proud to give a talk at sota's webinar.
My main talk will be an hour-long deep dive into the current state of on-device LLMs, exploring their advantages, trade-offs, and limitations.
The session will end with a Q&A, where you can ask me anything about this subject.
Had a fantastic chat today with Georgi Gerganov, the brilliant mind behind ggml, llama.cpp, and whisper.cpp! We discussed:
π The integration of vision models into llama.cpp
π The challenges of maintaining a smooth UX/DX
π The exciting future of llama.cpp
Big things ahead - stay tuned!
OK now you are the best, Gememe 2.0
Yes, while waiting for the proper support, I made this temporary playground so that people can get an idea of what llama.cpp will become in the near future :)