Models, checkpoints, training code under Apache 2.0.
🧑‍🍳 Kudos to the whole team @nohtow.bsky.social Luca Arnaboldi @amelietabatta.bsky.social @krzakalaf.bsky.social
📖 Dive into the release: www.lighton.ai/lighton-blog...
🔥 SOTA on BEIR, <150M params
⚡ Supervised-first → distill = most of the gains for a fraction of the cost
🧠 Prompt alignment is non-negotiable to preserve peak performance through fine-tuning
In collaboration with @epfl-ai-center.bsky.social and the Swiss AI initiative, LightOn pre-trained it end-to-end for late-interaction retrieval.
Day Zero for Multi-Vector Retrieval.
Today we're flipping the retrieval playbook: no dense model adaptation, no retrofit.
🏗️ Multi-vector from scratch, powered by PyLate.
Meet ColBERT-Zero
Give your coding agent the search it deserves.
Huge kudos to @nohtow.bsky.social and @raphaelsty.bsky.social
Read more: www.lighton.ai/lighton-blog...
What we measured with Claude Code:
📈 70% win rate vs. vanilla grep
💰 ~60k tokens saved per question
🤖 56% fewer search operations
Built in Rust with NextPlaid. 100% local. No code leaves your machine.
ColGrep is powered by LateOn-Code-edge (17M) and LateOn-Code (130M), the first late-interaction models purpose-built for code.
🏆 They top MTEB Code, outperforming models up to 17x their size while running instantly on a laptop.
ColGrep mirrors the grep interface your agents already use, but replaces pattern matching with semantic scoring, and supports hybrid queries that combine both. It plugs straight into Claude Code, OpenCode, or Codex.
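To make the "pattern matching plus semantic scoring" idea concrete, here is a toy sketch of hybrid ranking. This is not ColGrep's actual implementation (which is Rust and embedding-based); the token-overlap `semantic_score` below is a crude stand-in for late-interaction similarity, and all names are hypothetical:

```python
import re
from typing import Optional

def semantic_score(query: str, snippet: str) -> float:
    """Crude proxy for semantic similarity: the fraction of query tokens
    that also appear in the snippet (real systems use embeddings)."""
    q = set(query.lower().split())
    s = set(re.findall(r"\w+", snippet.lower()))
    return len(q & s) / max(len(q), 1)

def hybrid_score(query: str, pattern: Optional[str], snippet: str,
                 alpha: float = 0.5) -> float:
    """Blend an exact-pattern signal (regex hit contributes 1.0)
    with the semantic proxy, weighted by alpha."""
    sem = semantic_score(query, snippet)
    pat = 1.0 if pattern and re.search(pattern, snippet) else 0.0
    return alpha * pat + (1 - alpha) * sem

snippets = [
    "def parse_config(path): return json.load(open(path))",
    'fn main() { println!("hello"); }',
]
# Rank snippets for a natural-language query plus an optional regex:
# the Python config-parsing snippet wins on both signals.
ranked = sorted(snippets,
                key=lambda s: hybrid_score("load json config", r"parse_\w+", s),
                reverse=True)
```

The point of the blend is that a pure regex misses paraphrases while a pure semantic score ignores exact identifiers; combining them lets one query express both intents.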
🔥 Stop burning tokens on blind grep searches. Give your coding agent semantic eyes.
Meet LateOn-Code & ColGrep:
a Rust-powered search tool and two SOTA late-interaction models that bring intent-level code retrieval directly to your terminal.
Huge kudos to @raphaelsty.bsky.social for shipping this breakthrough! π
Read the full article here 👉 www.lighton.ai/lighton-blog...
NextPlaid represents the "Blanc" milestone in our Bleu/Blanc/Rouge roadmap for enterprise document intelligence. It follows the "Bleu" release, LightOnOCR-2, a SOTA 1B OCR model that converts complex documents into clean, usable text.
⚙️ Production Ready:
Built in Rust and optimized for CPUs, it supports incremental index updates and concurrent reads/writes, capabilities missing from standard implementations.
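As an illustration of what "incremental updates with concurrent reads" can look like, here is a toy copy-on-write index in Python. It sketches the general pattern only; NextPlaid's Rust internals are not described in this post, and none of these names come from its API:

```python
import threading

class CopyOnWriteIndex:
    """Readers always see an immutable snapshot; writers build a new
    snapshot and publish it with a single reference swap, so searches
    never block on an in-progress update."""

    def __init__(self):
        self._snapshot = {}                  # doc_id -> vectors, never mutated
        self._write_lock = threading.Lock()  # serializes writers only

    def search(self, doc_id):
        # Lock-free read: capture the current snapshot reference.
        return self._snapshot.get(doc_id)

    def add(self, doc_id, vectors):
        with self._write_lock:
            new_snapshot = dict(self._snapshot)  # incremental: copy + extend
            new_snapshot[doc_id] = vectors
            self._snapshot = new_snapshot        # atomic reference swap

idx = CopyOnWriteIndex()
idx.add("readme.md", [[0.1, 0.9]])
# Concurrent readers keep working while add() builds the next snapshot.
```

Copy-on-write trades write-side copying cost for entirely uncontended reads, which suits retrieval workloads where queries vastly outnumber index updates.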
🔌 Seamless Integration:
NextPlaid runs alongside your existing vector database. You can add multi-vector retrieval to your established RAG pipeline without ripping anything out.
🌱 Frugal Inference: High-signal context reduces the amount of noise sent to your LLM, allowing it to answer with fewer, more accurate tokens.
Why NextPlaid is the missing layer for your RAG stack:
🎯 Precision Matching: Retrieval matches at the token level, surfacing the exact passage that answers your question rather than just a document that vaguely relates.
By representing documents as sets of vectors, one per token, we preserve the distinct concepts and precise details that other search engines average away.
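The token-level matching described above is the MaxSim operation at the heart of late interaction: each query-token vector keeps only its best-matching document-token vector, and those maxima are summed. A minimal sketch with toy 2-d vectors (real systems use learned embeddings and optimized kernels):

```python
def dot(u, v):
    """Plain dot product over two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score: sum, over query tokens, of the best
    similarity against any single document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Two-token query against a three-token document (toy vectors).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
score = maxsim(query, doc)  # 0.9 + 0.8, roughly 1.7
```

Averaging `doc` into a single vector would blur its two distinct concepts together; MaxSim keeps each query token's best match intact, which is exactly the "don't average away details" claim above.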
🔍🪡 To find the needle, you'd better index every straw of the haystack.
Today, LightOn is launching LightOn NextPlaid: a CPU-optimized multi-vector database that indexes at the token level.
In the enterprise:
📄 your documents are living things,
🔍 observability is indispensable,
🌳 noise is costly and GPUs don't grow on trees!
A dense, no-nonsense episode on AI in the enterprise.
🎧 Listen to the episode
🎵 Spotify: open.spotify.com/episode/4Dtt...
@amelietabatta.bsky.social, Head of Knowledge & Search at @lightonai.bsky.social, is Laurent Nicolas-Guennoc's guest on the Converteo podcast "Changement d'époque"
Pushing back against the "bigger context = better" narrative, Amélie sets the record straight.
🎙️ "Putting all your documents into a model's context is like inviting 30 people to a meeting where one would do: it's expensive, it's noisy, and in the end the result is less precise!"
Congrats to @orionweller.bsky.social @jhuclsp.bsky.social @nohtow.bsky.social for pushing the boundaries of useful AI.
🧑‍🍳 Read the open recipe here: lighton.ai/lighton-blog...
Size matters less than the right architecture choice.
Thatβs why the smallest Ettin model is already being massively adopted to build high-performance Edge AI.
Itβs time to stop forcing "Decoder-only" models on every problem. For high-value tasks, specialized engineering beats generic scale.
Ettin was built as the first-ever SOTA suite of paired encoder-only & decoder-only models to prove a point:
🔍 Encoders for classification & retrieval
✍️ Decoders for text generation
The Ettin suite paper has been accepted to @iclr-conf.bsky.social
It highlights the elephant in the room:
🏗️ Architecture matters.
The "G" in RAG only amplifies what the "R" provides.
If your retrieval layer is static, your AI is hallucinating on facts.
Here is how LightOn approaches RAG as critical infrastructure, not just a chatbot feature.
👉 www.lighton.ai/lighton-blog...
When you treat it as a simple add-on:
📉 Relevance drops as document versions change.
🔒 Security blocks you because access control wasn't enforced at query time.
⚠️ Trust erodes because the system generates confident answers based on last week's data.
Stop building RAG like a feature. It is infrastructure.
RAG inherits every constraint of your organization: scale, heterogeneous data, and strict governance.
This makes high-performance OCR far more accessible for edge deployments, privacy-sensitive use cases, and cost-efficient production setups.
🔗 Model weights available here: huggingface.co/lightonai/Li...
📦 No GPU required
💻 Runs locally on a laptop (CPU-friendly)
🔥 SOTA performance on your data: fine-tune easily using standard Hugging Face tooling (LoRA, PEFT, Trainer)
🤗 LightOnOCR-1B is now in Hugging Face Transformers
With 1.2B downloads, the Transformers library is the go-to toolkit for developers building AI applications.
Any developer can now add state-of-the-art document reading to their app in one line of code.
#Transformers #OCR #GenAI