the irony on the vibecoding subreddit
but isn't this done more strictly in practice (to keep the legal risk minimal)? lots of little hacks here and there, but bigger orgs seem to have higher standards, no?
UAE
outage scenario: electricity turned off because of a drone / rocket attack
ongoing updates: health.aws.amazon.com/health/status
overview on amplifying.ai/research/cla... with a deep dive in amplifying.ai/research/cla... 5/5
models split
personalities
models have personalities in their recommendations and in how conservative or cutting-edge they are: massive differences even within a close "family" 4/5
default stack
defaults
new defaults: some general, some language-specific (the report is very JS- and python-focused) 3/5
build vs buy
depending on the area, there is a clear bias towards building vs buying
feature toggles as a full-blown company were always a weird choice... 2/5
article
*claude code is the new gatekeeper*
fascinating report on the choices claude makes, build vs buy, the personality of different models, and who is winning / losing 1/5
PS: when creating code is cheap, the new moat is distribution
overview
latency
conclusion
4x performance improvement by upgrading from #elasticsearch 8 to 9. "this one weird trick your cloud provider / hardware vendor hates"
medium.com/trendyol-tec...
PS: I have a hunch what made the difference here
full announcement blog post: jina.ai/news/jina-em...
hugging face: huggingface.co/collections/...
paper for even more details: arxiv.org/abs/2602.15547
overview
multilingual
english
retrieval
2 new #jina models have entered the embedding arena, v5 multilingual:
* small: 1024 dim, 32K context
* nano: 768 dim, 8K context
both support matryoshka dimension truncation (32+) and are at the top of current benchmarks, especially for their parameter size
publicly accessible (non-commercial)
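for illustration, here's what matryoshka truncation amounts to on the client side (a minimal numpy sketch; `truncate_matryoshka` is a made-up helper, not jina's API): keep a prefix of the vector and re-normalize.

```python
import numpy as np

def truncate_matryoshka(embedding: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length.
    Matryoshka-trained models pack the most important information into
    the leading dimensions, so prefixes stay useful on their own."""
    truncated = embedding[:dim]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

# toy stand-in for a 1024-dim "small" model output
full = np.random.default_rng(0).normal(size=1024)
short = truncate_matryoshka(full, 64)  # any target >= 32 per the post
```

shorter vectors trade a little accuracy for much cheaper storage and faster similarity search.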
gandalf prompt injection is still fun: gandalf.lakera.ai/gandalf 🧙‍♂️
though almost disappointing if one prompt takes you through multiple levels (🇦🇹)
* OpenClaw agent opening a PR on matplotlib
* reviewer rejects it per the repo's policies
* agent replies with a personal attack
github.com/matplotlib/m...
details
* different views per solution, and you can filter by version or other labels, e.g. `label:"v9.3.0"`
* the underlying issue describes what it does, for who, and the value proposition
take a look on github.com/orgs/elastic...
comments are currently disabled but let us know if that's a deal-breaker
2/2
overview
future
new public #elastic roadmap:
* covering key initiatives like ES|QL, better dashboards,...
* recently shipped features (those are our fiscal quarters)
* upcoming features as in-progress, near-term, and mid-term
1/2
schedule
room 1
room 2
catching up on the search dev room at #FOSDEM: lots of good stuff on fosdem.org/2026/schedul...
glad we (or carly richmond) could put this together again. this won't be the last one.
PS: given our colorful past, it's especially great to be back at FOSDEM pushing the search dev room and OSS :)
come on, do something
alert
usage
targeted billing emails work 📬
PS: notebooklm.google.com is great to make this easier and more approachable
jina-clip-v2 uses a multi-task, multi-stage contrastive learning strategy to align multilingual text and image representations to work well at both cross-modal and text-only retrieval. it excels at visually rich documents while flexibly truncating embedding dimensions
arxiv.org/abs/2412.08802
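the contrastive part of that strategy can be sketched as a symmetric InfoNCE loss (illustrative numpy only, not jina's training code; the 0.07 temperature is a common default, not theirs):

```python
import numpy as np

def clip_style_loss(text_emb: np.ndarray, image_emb: np.ndarray,
                    temperature: float = 0.07) -> float:
    """Symmetric InfoNCE: matched text/image pairs sit on the diagonal
    of the similarity matrix; every off-diagonal pair is a negative."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature            # (batch, batch) cosine sims
    n = logits.shape[0]

    def xent(l):  # mean cross-entropy with the diagonal as the target
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # average the text->image and image->text directions
    return float((xent(logits) + xent(logits.T)) / 2)

rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))                 # toy batch of 4 pairs
aligned = clip_style_loss(emb, emb)           # matched pairs: low loss
shuffled = clip_style_loss(emb, emb[::-1])    # misaligned: high loss
```

training pushes matched pairs together and everything else apart, which is what makes one space work for both cross-modal and text-only retrieval.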
ReaderLM-v2 uses a new three-stage data synthesis pipeline called "draft-refine-critique" alongside a unified training framework to transform messy HTML into structured data, making it a highly effective tool that outperforms much larger models on web content extraction
arxiv.org/pdf/2503.01151
jina-embeddings-v4 works by projecting text & images into a shared semantic space using a unified Qwen2.5-VL backbone and task-specific LoRA adapters, which minimizes the modality gap and enables SOTA retrieval of visually rich documents via single- and multi-vector outputs
arxiv.org/abs/2506.18902
using a compact autoregressive backbone pre-trained on text and code, along with task-specific instruction prefixes and last-token pooling, jina-code-embeddings generates high-quality embeddings that achieve state-of-the-art performance competitive with much larger models
arxiv.org/abs/2508.21290
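last-token pooling itself is easy to sketch (numpy toy, not the actual implementation): with a causal model, only the final non-padding token has seen the whole input, so its hidden state becomes the embedding.

```python
import numpy as np

def last_token_pool(hidden_states: np.ndarray,
                    attention_mask: np.ndarray) -> np.ndarray:
    """In a causal model only the last non-padding token has attended
    to the whole input, so its hidden state serves as the embedding."""
    last_idx = attention_mask.sum(axis=1) - 1              # (batch,)
    return hidden_states[np.arange(len(last_idx)), last_idx]

# toy batch: 2 sequences, max length 4, hidden size 3
hidden = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],   # 3 real tokens, 1 padding
                 [1, 1, 1, 1]])  # 4 real tokens
pooled = last_token_pool(hidden, mask)       # shape (2, 3)
```

contrast this with mean pooling: in an autoregressive backbone earlier tokens haven't seen the rest of the sequence, so the last token is the natural summary.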
jina-reranker-v3 uses a new "last but not late" interaction strategy that processes the query and multiple documents simultaneously in a single shared context window, allowing it to capture cross-document and query-document relationships with its compact 0.6B parameter model
arxiv.org/abs/2509.25085
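very roughly, the "single shared context" idea looks like this (an invented prompt format for illustration, not the model's real template):

```python
def pack_listwise_input(query: str, documents: list[str]) -> str:
    """Toy illustration of putting the query and all candidates into
    one context so the model can compare documents against each other,
    not just each document against the query in isolation."""
    parts = [f"Query: {query}"]
    parts += [f"Document [{i}]: {doc}" for i, doc in enumerate(documents)]
    return "\n".join(parts)

prompt = pack_listwise_input("best embedding model",
                             ["doc about rerankers", "doc about pooling"])
```

pointwise rerankers score each (query, document) pair separately; scoring all candidates in one pass is what lets a small model pick up cross-document signals.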
long US weekend: great time to catch up on some @JinaAI_ papers about rerankers and code / multilingual / multimodal embeddings:
* jina-reranker-v3
* jina-code-embeddings
* jina-embeddings-v4
* ReaderLM-v2
* jina-clip-v2
san francisco in january (downtown / FiDi)
better shot
free and OSS alternative to CleanShot X: www.bettershot.site
because not everything needs 100 features and a cloud service...
PS: just added an issue for homebrew support
tailwind
earning a living with OSS was never easy, but it seems to be getting even harder with LLMs. take tailwind, which ironically LLMs love: github.com/tailwindlabs...
not an encouraging thought
anybody tried z.ai in anger?
not sure about the quality other than the claims; there is a more in-depth paper: arxiv.org/abs/2508.06471
but the pricing looks really hard to beat and you can just put it "under" the common tools