Mark Pors 🦖

@pors

AI engineer. Previously co-founder and CTO at WatchMouse. Building https://paperzilla.ai

56 Followers · 280 Following · 88 Posts · Joined 07.11.2023

Latest posts by Mark Pors 🦖 @pors

Query Disambiguation via Answer-Free Context: Doubling Performance on Humanity's Last Exam
How carefully and unambiguously a question is phrased has a profound impact on the quality of the response, for Language Models (LMs) as well as people. While model capabilities continue to advance, t...

Resources:

arXiv: arxiv.org/abs/2603.04454

Github repo: github.com/mmajurski/lm...

08.03.2026 18:06 👍 0 🔁 0 💬 0 📌 0

This paper showed up in my Paperzilla "RAG, Retrieval, and Semantic search" feed: paperzilla.ai/digest/d530c...

08.03.2026 18:05 👍 0 🔁 0 💬 1 📌 0
When RAG systems surface relevant information, LM performance can be enhanced by rewriting the initial query using context: added information that, without providing the answer, gives relevant background knowledge and direction.

A new paper shows that using retrieved, answer-free context to rewrite a question into a clearer one before the final LLM call improves accuracy substantially.

In practice, that means you can offload the rewrite to a smaller, cheaper model and get better results at lower cost.

Paper + resources 👇

08.03.2026 18:03 👍 1 🔁 0 💬 1 📌 0
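
A minimal sketch of the rewrite-then-answer flow described in this post. It assumes two hypothetical callables, `rewriter` (the smaller, cheaper model) and `answerer` (the final LLM), that each map a prompt string to a completion string. The prompt wording and all function names are illustrative assumptions, not taken from the paper or its repo.

```python
from typing import Callable, Sequence

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


def build_rewrite_prompt(question: str, context: Sequence[str]) -> str:
    """Build a prompt asking a model to restate the question using background only."""
    background = "\n".join(f"- {snippet}" for snippet in context)
    return (
        "Rewrite the question so it is clear and unambiguous.\n"
        "Use the background for terminology and scope, but do not answer it.\n\n"
        f"Background:\n{background}\n\n"
        f"Question: {question}\n"
        "Rewritten question:"
    )


def answer_with_rewrite(
    question: str,
    context: Sequence[str],
    rewriter: LLM,   # assumed: a small, cheap model
    answerer: LLM,   # assumed: the final, stronger model
) -> str:
    """Rewrite the question with answer-free context, then answer the rewrite."""
    rewritten = rewriter(build_rewrite_prompt(question, context)).strip()
    # Fall back to the original question if the rewriter returns nothing.
    return answerer(rewritten or question)
```

Only the rewritten question reaches `answerer` here; whether the retrieved context should also be passed to the final call is a design choice this sketch leaves open.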

PS: let me know which preprint/open sources you are interested in, and I can add them.

08.03.2026 10:06 👍 0 🔁 0 💬 0 📌 0

Please have a look at the alternative to Google Scholar alerts I've built. It significantly reduces the number of papers you have to evaluate. It can be consumed as email digests, an RSS feed, or via API/MCP. paperzilla.ai

08.03.2026 10:05 👍 1 🔁 0 💬 1 📌 0
SciDER: Scientific Data-centric End-to-end Researcher
Automated scientific discovery with large language models is transforming the research lifecycle from ideation to experimentation, yet existing agents struggle to autonomously process raw data collect...

Paper: arxiv.org/abs/2603.014...
Repo: github.com/leonardodali...
Demo: huggingface.co/spaces/AI4Re...

05.03.2026 12:59 👍 0 🔁 0 💬 0 📌 0

Paperzilla found this paper for me, here's the summary: paperzilla.ai/digest/9cee0...

05.03.2026 12:57 👍 0 🔁 0 💬 1 📌 0

13 domain experts (PhDs, professors, industry researchers) rated SciDER 4.85/5 for "helpfulness" in reducing workflow and achieving data-grounded accuracy.

Also: an astrophysicist used SciDER to analyze the Kepler Exoplanet Dataset and achieved a 98% F1 score on exoplanet detection.

Read on 👇

05.03.2026 12:55 👍 0 🔁 0 💬 1 📌 0
SciDER: Scientific Data-centric End-to-end Researcher. Flowchart of expert-based and LLM-based research lifecycle.

This project (paper, demo, and open-source repo) seems like a promising step toward having an AI scientist on your team.

The framework, SciDER, actually does science. Data → experiments → results. End-to-end.

Read on 👇

#ai4science

05.03.2026 12:53 👍 0 🔁 0 💬 3 📌 0

Just give it a try: paperzilla.ai and let me know if it is useful for you. If not, let me know as well :)

04.03.2026 19:46 👍 1 🔁 0 💬 0 📌 0

In Paperzilla, it starts with the user describing their research topic. That results in a project that covers categories from multiple sources (e.g. both ChemRxiv and arXiv). The output is a feed that OpenClaw (or another agent) consumes, so the agent works with a narrow, relevant context.

04.03.2026 19:46 👍 1 🔁 0 💬 4 📌 0

paperzilla.ai/news/chemrxi... cc @openclaw-x.bsky.social

04.03.2026 19:14 👍 0 🔁 0 💬 1 📌 0
ChemRxiv coverage and a Paperzilla skill for OpenClaw

The weekly Paperzilla improvements are here: ChemRxiv coverage in beta and a Paperzilla skill for OpenClaw.

Worlds are colliding!

Link to full news item in comment 👇

04.03.2026 19:13 👍 1 🔁 0 💬 3 📌 0

arXiv: arxiv.org/abs/2602.233...

03.03.2026 09:23 👍 0 🔁 0 💬 0 📌 0

Paperzilla found this paper for me and created this summary: paperzilla.ai/digest/df041...

03.03.2026 09:23 👍 0 🔁 0 💬 1 📌 0

They found that AI agents using simple keyword search tools can answer questions nearly as well as pipelines built on complex, expensive vector databases. So: cheaper, easier to maintain, and nearly as good.

Paper & summary in the comments 👇

03.03.2026 09:22 👍 1 🔁 0 💬 1 📌 0
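
To make "simple keyword search" concrete, here is a minimal sketch of BM25 ranking in plain Python. It is a generic textbook-style BM25 with a non-negative IDF variant, not the tool setup from the paper. The whitespace tokenizer, the function names, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase + whitespace split; real systems use better tokenizers.
    return text.lower().split()


def bm25_rank(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[int]:
    """Return document indices sorted from most to least relevant to the query."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = tokenize(query)

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a higher weight; the "1 +" keeps the IDF non-negative.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)
```

An agent could expose something like `bm25_rank` as a search tool and call it repeatedly with reformulated queries; no embeddings or vector index are involved.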
Comparison between RAG (red) and agent-based (blue) pipelines for document QnA

A while ago I found a paper that showed that BM25 search sometimes beats RAG. Here is another paper, by @awscloud.bsky.social, showing that agentic keyword search is also often the better choice.

Read on 👇

03.03.2026 09:22 👍 0 🔁 0 💬 1 📌 0
Scholar-Skill: 21 Skills Organized by Research Stage

Interesting new paper: a scientist at @stonybrooku.bsky.social created scholar-skill, a 21-skill plugin for Claude Code. He used it to automate the social science pipeline. Pretty cool! Read the paper below 👇

01.03.2026 08:52 👍 2 🔁 1 💬 1 📌 0
Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace or Augment Social Scientists?
AI agents -- systems that execute multi-step reasoning workflows with persistent state, tool access, and specialist skills -- represent a qualitative shift from prior automation technologies in social...

arXiv link: arxiv.org/abs/2602.224...

01.03.2026 08:54 👍 0 🔁 0 💬 0 📌 0

Paperzilla paper summary: paperzilla.ai/p/01c33cdf/v...

01.03.2026 08:53 👍 0 🔁 0 💬 1 📌 0
VeRO: An Evaluation Harness for Agents to Optimize Agents
An important emerging application of coding agents is agent optimization: the iterative improvement of a target agent through edit-execute-evaluate cycles. Despite its relevance, the community lacks a...

arXiv: arxiv.org/abs/2602.22480

27.02.2026 17:42 👍 0 🔁 0 💬 0 📌 0
Paperzilla - High-signal research paper feeds. Never miss what matters.
High-signal research paper feeds from arXiv, medRxiv, bioRxiv, and ChinaXiv. Smart feeds and email digests that surface fewer, better papers.

Paperzilla paper summary: paperzilla.ai/digest/52a98...

27.02.2026 17:42 👍 0 🔁 0 💬 1 📌 0
VERO system architecture. Top (orange): example optimization trajectory. Bottom (green): system components. VERO enforces versioning, reproducible execution, and controlled feedback, enabling systematic comparison of optimizers for agent optimization.

Agents that improve agents is a thing.

But how do we know if they actually get better?

A new paper by Scale AI just proposed an evaluation harness to do this.

Still waiting for the repo, but in the meantime let's read it 👇

27.02.2026 17:41 👍 0 🔁 0 💬 1 📌 0

Full post paperzilla.ai/news/mcp-ser...

25.02.2026 07:53 👍 0 🔁 0 💬 0 📌 0
Generate an MCP API key in the Paperzilla project portal to get started

Research agents need fresh context to stop hallucinating and to stay up to date.

We just shipped a native MCP server and an llms.txt docs portal so AI research agents can pull high-signal paper feeds without wrapper code. Read more 👇

25.02.2026 07:52 👍 2 🔁 0 💬 1 📌 0
ERNIE 5.0 Technical Report
In this report, we introduce ERNIE 5.0, a natively autoregressive foundation model designed for unified multimodal understanding and generation across text, image, video, and audio. All modalities are...

Source arxiv.org/abs/2602.04705

24.02.2026 11:14 👍 0 🔁 0 💬 0 📌 0
Paperzilla - High-signal research paper feeds. Never miss what matters.
High-signal research paper feeds from arXiv, medRxiv, bioRxiv, and ChinaXiv. Stay current with weekly briefings and preprint alerts - fewer, better papers.

Still a very fascinating paper for the #ML crowd. Paperzilla summary: paperzilla.ai/digest/d406e...

24.02.2026 11:12 👍 0 🔁 0 💬 1 📌 0
ERNIE 5.0 Technical Report with some of its authors

New SOTA for "longest author list"?

Baidu dropped ERNIE 5.0.

The specs are wild (1 trillion parameters, unified audio/video/text).

But the author list is wilder: **438 people**.

This isn't a research paper anymore. It's a Marvel movie production 😀

24.02.2026 11:05 👍 0 🔁 0 💬 1 📌 0
Paperzilla Docs Portal
Official Paperzilla docs portal for projects, feeds, CLI, agent workflows, and integrations.

Paperzilla's documentation now has its own portal at docs.paperzilla.ai

23.02.2026 08:14 👍 0 🔁 0 💬 0 📌 0