Resources:
arXiv: arxiv.org/abs/2603.04454
Github repo: github.com/mmajurski/lm...
This paper showed up in my Paperzilla "RAG, Retrieval, and Semantic search" feed: paperzilla.ai/digest/d530c...
When RAG systems surface relevant information, LM performance can be enhanced by rewriting the initial query using context-added information that, without providing the answer, gives relevant background knowledge and direction.
A new paper shows that using retrieved, answer-free context to rewrite a question into a clearer one before the final LLM call substantially improves accuracy.
In practice, that means you can offload the rewrite to a smaller, cheaper model and get better results at lower cost. A minimal sketch of the idea is below.
Paper + resources 👇
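Roughly, the pipeline looks like this. This is a minimal sketch, not the paper's implementation: I'm assuming an OpenAI-compatible client, and the prompt wording, model names, and the rewrite_query/answer helpers are placeholders of my own; the passages would come from your retriever.

```python
# Minimal sketch of answer-free query rewriting before the final LLM call.
# Model names and the prompt are placeholders, not the paper's setup.
from openai import OpenAI

client = OpenAI()

def rewrite_query(question: str, passages: list[str], model: str = "gpt-4o-mini") -> str:
    """Use a small, cheap model to rewrite the question with retrieved background,
    explicitly without answering it."""
    background = "\n\n".join(passages)
    prompt = (
        "Rewrite the question below so it is clearer and more specific, "
        "using the background for context. Do NOT answer the question.\n\n"
        f"Background:\n{background}\n\nQuestion: {question}\n\nRewritten question:"
    )
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content.strip()

def answer(question: str, passages: list[str], model: str = "gpt-4o") -> str:
    """The final (larger) model answers the rewritten question."""
    rewritten = rewrite_query(question, passages)
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": rewritten}]
    )
    return resp.choices[0].message.content
```

The point is that the rewrite step only needs to produce a better question, so a small model is usually enough.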
PS, let me know what preprint/open sources you are interested in, and I can add them
Please have a look at the alternative to Google Scholar alerts I've built. It significantly reduces the number of papers you have to evaluate. It can be consumed as email digests, an RSS feed, or via API/MCP. paperzilla.ai
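For the RSS route, pulling a feed into your own tooling is a few lines. A minimal sketch with feedparser; the feed URL below is a made-up placeholder, your project's real feed URL comes from the portal.

```python
# Minimal sketch: read a paper feed over RSS with feedparser.
# FEED_URL is a hypothetical placeholder; use your project's real feed URL.
import feedparser

FEED_URL = "https://example.com/my-project/feed.rss"  # placeholder

feed = feedparser.parse(FEED_URL)
for entry in feed.entries:
    # Each entry typically carries a title, link, and summary.
    print(entry.title)
    print(entry.link)
```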
Paper: arxiv.org/abs/2603.014...
Repo: github.com/leonardodali...
Demo: huggingface.co/spaces/AI4Re...
Paperzilla found this paper for me, here's the summary: paperzilla.ai/digest/9cee0...
13 domain experts (PhDs, professors, industry researchers) rated SciDER 4.85/5 for "helpfulness" in reducing workload and achieving data-grounded accuracy.
Also: An astrophysicist used SciDER to analyze the Kepler Exoplanet Dataset and achieved 98% F1 score on exoplanet detection.
Read on 👇
SciDER: Scientific Data-centric End-to-end Researcher. Flowchart of expert-based and LLM-based research lifecycle.
This project (paper, demo, and open-source repo) seems like a promising step toward having an AI scientist on your team.
The framework, SciDER, actually does science. Data → experiments → results. End-to-end.
Read on 👇
#ai4science
Just give it a try: paperzilla.ai and let me know if it is useful for you. If not, let me know as well :)
In Paperzilla, it starts with the user describing their research topic. That produces a project covering categories from multiple sources (e.g. both ChemRxiv and arXiv). The output is a feed that OpenClaw (or another agent) consumes, so it works with a narrow, relevant context.
paperzilla.ai/news/chemrxi... cc @openclaw-x.bsky.social
ChemRxiv coverage and a Paperzilla skill for OpenClaw
The weekly Paperzilla improvements are here: ChemRxiv coverage in beta and a Paperzilla skill for OpenClaw
Worlds are colliding!
Link to full news item in comment 👇
arXiv: arxiv.org/abs/2602.233...
Paperzilla found this paper for me and created this summary: paperzilla.ai/digest/df041...
They found that AI agents using simple keyword-search tools can answer questions nearly as well as pipelines built on complex, expensive vector databases. So: cheaper, easier to maintain, and nearly as good.
Paper & summary in the comments 👇
Comparison between RAG (red) and agent-based (blue) pipelines for document QnA
A while ago I found a paper that showed that BM25 search sometimes beats RAG. Here is another paper, by @awscloud.bsky.social, showing that agentic keyword search is also often the better choice. A minimal sketch of such a keyword-search tool is below.
Read on 👇
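The keyword-search tool an agent calls can be very simple. A minimal sketch with the rank_bm25 package, using a toy corpus and whitespace tokenization of my own, not the paper's setup:

```python
# Minimal sketch of a BM25 keyword-search tool an agent could call.
# Corpus and tokenization are toy placeholders, not the paper's setup.
from rank_bm25 import BM25Okapi

corpus = [
    "BM25 is a classic keyword ranking function used by search engines.",
    "Vector databases store dense embeddings for semantic retrieval.",
    "Agents can call a search tool repeatedly and refine their queries.",
]
tokenized = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized)

def search(query: str, k: int = 2) -> list[str]:
    """Return the top-k documents for a keyword query."""
    return bm25.get_top_n(query.lower().split(), corpus, n=k)

print(search("keyword search for agents"))
```

An agent loop would call search() as a tool, inspect the hits, and reformulate the query as needed.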
Scholar-Skill: 21 Skills Organized by Research Stage
New interesting paper: a scientist at @stonybrooku.bsky.social created Scholar-Skill, a 21-skill plugin for Claude Code, and used it to automate the social science research pipeline. Pretty cool! Read the paper below 👇
Paperzilla paper summary: paperzilla.ai/p/01c33cdf/v...
VERO system architecture. Top (orange): example optimization trajectory. Bottom (green): system components. VERO enforces versioning, reproducible execution, and controlled feedback, enabling systematic comparison of optimizers for agent optimization.
Agents that improve agents is a thing.
But how do we know if they actually get better?
A new paper by Scale AI just proposed an evaluation harness to do this.
Still waiting for the repo, but in the meantime let's read it 👇
Generate an MCP API key in the Paperzilla project portal to get started
Research agents need fresh context to stop hallucinating and to stay up to date.
We just shipped a native MCP server and an llms.txt docs portal so AI research agents can pull high-signal paper feeds without wrapper code. Read more 👇
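For plain HTTP agents, the llms.txt part is just a fetch. A minimal sketch; I'm assuming the file sits at the conventional /llms.txt path on docs.paperzilla.ai, so treat the URL as an assumption rather than a documented endpoint.

```python
# Minimal sketch: fetch the llms.txt index so an agent can discover the docs.
# Assumes the file sits at the conventional /llms.txt path on the docs domain.
import requests

resp = requests.get("https://docs.paperzilla.ai/llms.txt", timeout=10)
resp.raise_for_status()
print(resp.text)  # markdown index of docs pages for the agent to follow
```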
Still a very fascinating paper for the #ML crowd. Paperzilla summary: paperzilla.ai/digest/d406e...
ERNIE 5.0 Technical Report with some of its authors
New SOTA for "longest author list"?
Baidu dropped ERNIE 5.0.
The specs are wild (1 trillion parameters, unified audio/video/text).
But the author list is wilder: **438 people**.
This isn't a research paper anymore. It's a Marvel movie production.
Paperzilla's documentation now has its own portal at docs.paperzilla.ai