Paolo Papotti (@papotti)

We are reopening the interviews for this PhD position. Please help me spread the word to find the right potential candidates!

17.02.2026 16:02 👍 1 🔁 0 💬 0 📌 0

main architecture

We introduce
- Query planning as constrained optimization with quality constraints and a cost objective
- Gradient-based optimization to jointly choose operators and allocate error budgets across pipelines
- KV-cache–based operators to turn discrete physical choices into a runtime-quality continuum

11.02.2026 07:45 👍 0 🔁 0 💬 0 📌 0
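A minimal sketch of the budget-allocation idea in the post above, not the Stretto optimizer itself. It assumes a toy cost model in which operator i costs `cost_coeffs[i] / eps_i` when it is allowed error `eps_i`; that model, the function names, and the parameters are all hypothetical. Writing the budgets as `total_budget * softmax(z)` keeps the quality constraint (budgets sum to the total) satisfied at every step, so plain gradient descent on `z` only has to lower the cost.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def plan_cost(cost_coeffs, budgets):
    """Total cost under the ASSUMED model: cost_i = a_i / eps_i."""
    return sum(a / e for a, e in zip(cost_coeffs, budgets))


def allocate_error_budget(cost_coeffs, total_budget, lr=0.002, steps=2000):
    """Split total_budget across pipeline operators by gradient descent.

    ASSUMPTION (not from the paper): a looser error budget makes an
    operator cheaper, following cost_i = cost_coeffs[i] / eps_i.
    """
    z = [0.0] * len(cost_coeffs)  # unconstrained parameters, start uniform
    for _ in range(steps):
        shares = softmax(z)  # positive and sums to 1: constraint always holds
        costs = [a / (total_budget * s) for a, s in zip(cost_coeffs, shares)]
        total_cost = sum(costs)
        # Analytic gradient of the total cost w.r.t. z_j: s_j * C - c_j
        grad = [s * total_cost - c for s, c in zip(shares, costs)]
        z = [v - lr * g for v, g in zip(z, grad)]
    return [total_budget * s for s in softmax(z)]
```

For example, `allocate_error_budget([1.0, 4.0, 9.0], 0.3)` should move from the uniform split toward budgets proportional to the square roots of the coefficients, which is the closed-form optimum for this toy model. A real planner would also have to choose among discrete operators, which this sketch leaves out.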

Co-authors: Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig

This is the first outcome of our collaboration with Technische Universität Darmstadt within the @agencerecherche.bsky.social / @dfg.de ANR/DFG #Magiq project - more to come!

11.02.2026 07:45 👍 0 🔁 0 💬 1 📌 0

plots of results

Empirically, Stretto delivers 2x-10x faster execution 🔥 across various datasets and queries compared to prior systems that meet quality guarantees.

11.02.2026 07:45 👍 0 🔁 0 💬 1 📌 0

Stretto paper on arxiv

🚀 New: The Stretto Execution Engine for LLM-Augmented Data Systems.
LLM operators create a runtime ↔ accuracy trade-off in query execution. We address it with a novel optimizer, for end-to-end quality guarantees, and new KV-cache–based operators, for efficiency.
arxiv.org/abs/2602.04430
Details👇

11.02.2026 07:45 👍 3 🔁 0 💬 1 📌 0

Happy to be a Fontaines D.C. fan since their latest album (2024). But the real treat was discovering the previous ones!

01.02.2026 20:22 👍 1 🔁 0 💬 0 📌 1

I'd also like to test it, thanks!

23.01.2026 07:46 👍 0 🔁 0 💬 0 📌 0

I agree. Here is another trick for input context we recently published
bsky.app/profile/papo...

19.01.2026 12:20 👍 1 🔁 0 💬 0 📌 0

These results point toward models that decide which retrieved document to trust, turning “context engineering” from a static prompt recipe into a dynamic decoding policy.
Amazing work from Giulio Corallo in his industrial PhD at SAP!

15.01.2026 07:36 👍 0 🔁 0 💬 0 📌 0

Key insight: 𝐄𝐯𝐢𝐝𝐞𝐧𝐜𝐞 𝐚𝐠𝐠𝐫𝐞𝐠𝐚𝐭𝐢𝐨𝐧 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐚𝐭 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 𝐭𝐢𝐦𝐞. The model can effectively “switch” which document drives each token, without cross-document attention!

15.01.2026 07:36 👍 0 🔁 0 💬 1 📌 0

📈 Results: PCED often matches (and sometimes beats) long-context concatenation, while dramatically outperforming the KV merge baseline on multi-doc QA/ICL.
🚀 Systems win: ~180× faster time-to-first-token vs long-context prefill using continuous batching and Paged Attention.

15.01.2026 07:36 👍 0 🔁 0 💬 1 📌 0

Instead of concatenating docs into one context (slow, noisy attention), training-free PCED:
● Keeps each document as its own 𝐞𝐱𝐩𝐞𝐫𝐭 with independent KV cache
● Runs experts in 𝐩𝐚𝐫𝐚𝐥𝐥𝐞𝐥 to get logits
● Selects next token with a 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥-𝐚𝐰𝐚𝐫𝐞 𝐜𝐨𝐧𝐭𝐫𝐚𝐬𝐭𝐢𝐯𝐞 𝐝𝐞𝐜𝐨𝐝𝐢𝐧𝐠 rule integrating scores as a prior

15.01.2026 07:36 👍 0 🔁 0 💬 1 📌 0
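A minimal sketch of the three steps in the post above, assuming each expert's next-token logits are already available (in practice they would come from a model run over that document's own KV cache). The scoring rule is an illustrative assumption, not the paper's formula: each expert is contrasted against a hypothetical no-document prior, and the log of its normalized retrieval score is added as a prior. The names `pced_next_token`, `alpha`, and `beta` are likewise hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def pced_next_token(expert_logits, prior_logits, retrieval_scores,
                    alpha=1.0, beta=1.0):
    """Pick the next token across per-document experts.

    ASSUMED rule (illustrative only, not taken from the paper):
      score[k][t] = (1 + alpha) * log p_k(t) - alpha * log p_prior(t)
                    + beta * log(r_k / sum(r))
    where p_k is expert k's distribution, p_prior is a no-document
    distribution, and r_k is document k's (positive) retrieval score.
    Returns (token_id, expert_index) of the best-scoring pair.
    """
    prior_lp = log_softmax(prior_logits)
    total = sum(retrieval_scores)
    best_score, best_token, best_expert = -math.inf, None, None
    for k, logits in enumerate(expert_logits):
        expert_lp = log_softmax(logits)
        doc_prior = beta * math.log(retrieval_scores[k] / total)
        for t, (e, p) in enumerate(zip(expert_lp, prior_lp)):
            score = (1 + alpha) * e - alpha * p + doc_prior
            if score > best_score:
                best_score, best_token, best_expert = score, t, k
    return best_token, best_expert
```

Because the maximum is taken over (expert, token) pairs at every step, a different document can drive each generated token, and no attention is computed across documents. When two experts hold equally strong evidence for different tokens, the retrieval prior breaks the tie in favor of the higher-scored document.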

Parallel Context-of-Experts Decoding for Retrieval Augmented Generation
Retrieval Augmented Generation faces a trade-off: concatenating documents in a long prompt enables multi-document reasoning but creates prefill bottlenecks, while encoding document KV caches separatel...

🛑 𝐒𝐭𝐨𝐩 𝐭𝐡𝐫𝐨𝐰𝐢𝐧𝐠 𝐚𝐰𝐚𝐲 𝐲𝐨𝐮𝐫 𝐫𝐞𝐭𝐫𝐢𝐞𝐯𝐚𝐥 𝐬𝐜𝐨𝐫𝐞𝐬.
RAG uses embedding scores to pick the Top-K, then treats all retrieved chunks as equal.
Parallel Context-of-Experts Decoding (PCED) uses retrieval scores to move evidence aggregation from attention to decoding.
🚀 180× faster time-to-first-token!

15.01.2026 07:36 👍 4 🔁 1 💬 1 📌 1

AILY LABS hiring PhD position (start: early 2026): Tool-Augmented LLMs for Enterprise Data AI in Barcelona, Catalonia, Spain | LinkedIn

New PhD position on Tool-Augmented LLMs for Enterprise Data AI 🚨
Starting in early 2026 under my academic supervision and hosted by the fantastic team at AILY LABS in Madrid or Barcelona

Details are in the link. Please ping me with any questions!
www.linkedin.com/jobs/view/43...

21.11.2025 07:38 👍 1 🔁 1 💬 0 📌 1

Thumbnail: Accelerating Tabular Inference: Training Data Generation with TENET

Vol:18 No:12 → Accelerating Tabular Inference: Training Data Generation with TENET
👥 Authors: Enzo Veltri, Donatello Santoro, Jean-Flavien Bussotti, Paolo Papotti
📄 PDF: https://www.vldb.org/pvldb/vol18/p5303-veltri.pdf

04.09.2025 04:00 👍 2 🔁 2 💬 0 📌 0

Can We Trust the Judges? This is the question we asked in validating factuality evaluation methods via answer perturbation. Check out the results at the #EvalLLM2025 workshop at #TALN2025
Blog: giovannigatti.github.io/trutheval/
Watch: www.youtube.com/watch?v=f0XJ...
Play: github.com/GiovanniGatt...

30.06.2025 12:55 👍 3 🔁 1 💬 0 📌 0

Kudos to my amazing co-authors Dario Satriani, Enzo Veltri, Donatello Santoro! Another great collaboration between Università degli Studi della Basilicata and EURECOM 🙌

#LLM #Factuality #Benchmark #RelationalFactQA #NLP #AI

02.06.2025 14:51 👍 2 🔁 0 💬 0 📌 0

Structured outputs power analytics, reporting, and tool-augmented agents. This work exposes where current LLMs fall short and offers a clear tool for measuring progress on factuality beyond single-value QA. 📊

02.06.2025 14:51 👍 1 🔁 0 💬 1 📌 0

We release a new factuality benchmark with 696 annotated natural-language questions paired with gold factual answers expressed as tables (avg. 27 rows × 5 attributes), spanning 9 knowledge domains, with controlled question complexity and rich metadata.

02.06.2025 14:51 👍 0 🔁 0 💬 1 📌 0

Our new paper, "RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models", measures exactly this gap.

Wider or longer output tables = tougher for all LLMs! 🧨
From Llama 3 and Qwen to GPT-4, no LLM goes above 25% accuracy on our stricter measure.

02.06.2025 14:51 👍 0 🔁 0 💬 1 📌 0

RelationalFactQA: A Benchmark for Evaluating Tabular Fact Retrieval from Large Language Models
Factuality in Large Language Models (LLMs) is a persistent challenge. Current benchmarks often assess short factual answers, overlooking the critical ability to generate structured, multi-record tabul...

Ask any LLM for a single fact and it’s usually fine.
Ask it for a rich list and the same fact is suddenly missing or hallucinated because the output context got longer 😳

LLMs exceed 80% accuracy on single-value questions but accuracy drops linearly with the # of output facts

New paper, details 👇

02.06.2025 14:51 👍 8 🔁 0 💬 1 📌 0

and a special thanks to
@tanmoy-chak.bsky.social for leading this effort!

01.06.2025 08:43 👍 5 🔁 1 💬 0 📌 0

More co-authors here on bsky
@iaugenstein.bsky.social
@preslavnakov.bsky.social
@igurevych.bsky.social
@emilioferrara.bsky.social
@fil.bsky.social
@giovannizagni.bsky.social
@dcorney.com
@mbakker.bsky.social
@computermacgyver.bsky.social
@irenelarraz.bsky.social
@gretawarren.bsky.social

01.06.2025 08:43 👍 4 🔁 1 💬 1 📌 0

It’s time we rethink how "facts" are negotiated in the age of platforms.

Excited to hear your thoughts!
#Misinformation #FactChecking #SocialMedia #Epistemology #HCI #DigitalTruth #CommunityNotes

arxiv.org/pdf/2505.20067

01.06.2025 07:48 👍 6 🔁 0 💬 1 📌 0

Community-based moderation offers speed & scale, but also raises tough questions:
– Can crowds overcome bias?
– What counts as evidence?
– Who holds epistemic authority?

Our interdisciplinary analysis combines perspectives from HCI, media studies, & digital governance.

01.06.2025 07:48 👍 2 🔁 1 💬 1 📌 0

Platforms like X are outsourcing fact-checking to users via tools like Community Notes. But what does this mean for truth online?

We argue this isn’t just a technical shift — it’s an epistemological transformation. Who gets to define what's true when everyone is the fact-checker?

01.06.2025 07:48 👍 9 🔁 4 💬 1 📌 0

🚨 𝐖𝐡𝐚𝐭 𝐡𝐚𝐩𝐩𝐞𝐧𝐬 𝐰𝐡𝐞𝐧 𝐭𝐡𝐞 𝐜𝐫𝐨𝐰𝐝 𝐛𝐞𝐜𝐨𝐦𝐞𝐬 𝐭𝐡𝐞 𝐟𝐚𝐜𝐭-𝐜𝐡𝐞𝐜𝐤𝐞𝐫?
New paper: "Community Moderation and the New Epistemology of Fact Checking on Social Media"

with I. Augenstein, M. Bakker, T. Chakraborty, D. Corney, E. Ferrara, I. Gurevych, S. Hale, E. Hovy, H. Ji, I. Larraz, F. Menczer, P. Nakov, D. Sahnan, G. Warren, G. Zagni

01.06.2025 07:48 👍 16 🔁 8 💬 1 📌 0

🌟 New paper alert! 🌟
Our paper, "Retrieve, Merge, Predict: Augmenting Tables with Data Lakes", has been published in TMLR!
In this work, we created YADL (a semi-synthetic data lake), and we benchmarked methods for augmenting user-provided tables given information found in data lakes.
1/

19.05.2025 15:43 👍 6 🔁 3 💬 2 📌 2

Thanks to the whole team for the amazing work!

Joint work between Università degli Studi della Basilicata (Enzo Veltri, Donatello Santoro, Dario Satriani) and EURECOM (Sara Rosato, Simone Varriale).

#SQL #DataManagement #QueryOptimization #AI #LLM #Databases #SIGMOD2025

05.05.2025 18:03 👍 1 🔁 0 💬 0 📌 0

GitHub - dbunibas/galois: Galois

The principles in Galois – optimizing for quality alongside cost & dynamically acquiring optimization metadata – are a promising starting point for building robust and effective declarative data systems over LLMs. 💡

Paper and code: github.com/dbunibas/gal...

05.05.2025 18:03 👍 1 🔁 0 💬 1 📌 0
