
Ali Modarressi

@amodarressi

PhD student, NLP Researcher at @cislmu.bsky.social | Prev. Intern @Adobe.com

48 Followers · 104 Following · 13 Posts · Joined 02.05.2025

Latest posts by Ali Modarressi @amodarressi

Preview: Principled Personas: Defining and Measuring the Intended Effects of Persona Prompting on Task Performance — Expert persona prompting -- assigning roles such as "expert in math" to language models -- is widely used for task improvement. However, prior work shows mixed results on its effectiveness, and does not...

📢 New paper accepted at @eaclmeeting.bsky.social 2026:

Persistent Personas? Role-Playing, Instruction Following, and Safety in Extended Interactions

with
@mhedderich.bsky.social
@amodarressi.bsky.social
Hinrich Schuetze
& Benjamin Roth.

Preprint: arxiv.org/abs/2512.12775

23.01.2026 19:07 👍 2 🔁 1 💬 1 📌 0

🧑‍🔬 I'm recruiting PhD students in Natural Language Processing at @unileipzig.bsky.social Computer Science, together with @scadsai.bsky.social!

Topics include, but aren't limited to:

🔎 Linguistic Interpretability
🌍 Multilingual Evaluation
📖 Computational Typology

Please share!

#NLProc #NLP

11.12.2025 13:36 👍 41 🔁 25 💬 1 📌 3

CIS & MaiNLP Group picture at EMNLP 2025! 🤩 🤗 (1/3)

While I sadly 🥲 won't be at EMNLP this year myself, please do reach out to any of our members for a chat if you are interested in our research!

We also co-organize and participate in some great workshops at EMNLP:

06.11.2025 09:52 👍 13 🔁 1 💬 1 📌 0

Excited to be here in Suzhou for #EMNLP2025!
I'll be presenting "ImpliRet"; check out our poster on Friday, Nov. 7 at 14:00.
If you're into long-context, IR, or just want to chat, come *Pay Ali* a visit 😁
Link to thread:
x.com/zeinabtaghav...

06.11.2025 02:53 👍 1 🔁 0 💬 0 📌 0

Details on poster times and locations coming soon.

Would love to meet and chat ☕️💬

If you're attending #ACL2025, feel free to stop by and say hi! 👋
🧵[4/4]

20.07.2025 22:52 👍 0 🔁 0 💬 0 📌 0
Preview: Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models — Understanding how large language models (LLMs) acquire and store factual knowledge is crucial for enhancing their interpretability and reliability. In this work, we analyze the evolution of factual kn...

โฑ๏ธ๐Ÿ”Ž Time Course MechInterp
We track how factual knowledge forms in OLMo over training by analyzing the evolving roles of Attention Heads and FFNs.
Heads are dynamic and often repurposed; FFNs are stable and keep refining facts.
By: A. Dawar Hakimi
arxiv.org/abs/2506.03434
🧵[3/4]

20.07.2025 22:52 👍 0 🔁 0 💬 1 📌 0
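The head-vs-FFN contrast can be made concrete with a toy calculation (the numbers below are entirely made up, and this is not the paper's method or data): treat each component's per-fact contribution as a vector and compare checkpoints with cosine similarity. A "repurposed" head has a profile that flips between checkpoints; a stable FFN keeps one that persists.

```python
# Toy sketch: quantify how much a component's per-fact contribution profile
# changes between two training checkpoints. Low similarity = repurposed;
# high similarity = stable.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Fabricated contribution of one component to three probed facts,
# measured at an early and a late checkpoint.
head_early, head_late = [0.9, 0.1, 0.0], [0.0, 0.2, 0.9]  # profile flips: repurposed
ffn_early, ffn_late = [0.5, 0.4, 0.1], [0.6, 0.5, 0.1]    # profile persists: stable

head_stability = cosine(head_early, head_late)
ffn_stability = cosine(ffn_early, ffn_late)
assert head_stability < 0.2 < 0.9 < ffn_stability
```

In the actual study this comparison would be run over real attribution scores extracted from OLMo checkpoints, not hand-picked vectors.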

🌍 MEXA: Multilingual Evaluation of English-Centric LLMs

A method for assessing the multilingual capabilities of English-centric LLMs using parallel sentences. It estimates how many languages an LLM covers and at what level.

By: @kargaranamir.bsky.social

x.com/amir_nlp/sta...
🧵[2/4]

20.07.2025 22:52 👍 1 🔁 0 💬 1 📌 0
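The parallel-sentence idea behind MEXA can be sketched as a toy retrieval check (the 2-d "embeddings" below are fabricated, and MEXA's actual scoring differs): a language counts as covered when each target-language sentence embedding is closest to its English parallel rather than to a mismatched English sentence.

```python
# Toy sketch: alignment between English and target-language embeddings of
# parallel sentences as a proxy for language coverage.

import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def alignment_score(eng_embs, tgt_embs):
    """Fraction of target sentences whose nearest English embedding is the parallel one."""
    hits = 0
    for i, t in enumerate(tgt_embs):
        sims = [cosine(t, e) for e in eng_embs]
        hits += sims.index(max(sims)) == i
    return hits / len(tgt_embs)

# Fabricated embeddings: three English sentences and their translations
# in a language the model represents well.
eng = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
covered = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]  # well aligned with eng
assert alignment_score(eng, covered) == 1.0
```

A language whose embeddings do not line up with the English parallels would score well below 1.0, indicating weaker coverage.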

Leaving Vancouver after ICML's closing fireworks 😁🎆

Heading to Toronto for a few days, then off to
@aclmeeting.bsky.social to present:

"Collapse of Dense Retrievers"
A work by @mohsen-fayyaz.bsky.social that I was fortunate to collaborate on.

Also co-presenting two other papers…🧵 [1/4]

20.07.2025 22:52 👍 0 🔁 0 💬 1 📌 0
Preview: Ali Modarressi on X: "🚀 Introducing NoLiMa Paper 🚀 Most long-context benchmarks have literal overlaps between the questions and the context, but what if they didn't? 🤔 Turns out, it's a tough challenge! Even a powerful model like GPT-4o drops from 99.3% to 69.7% at 32K context length. 📉"

Full NoLiMa post thread (X / Twitter): x.com/AModarressi/...

09.07.2025 13:53 👍 0 🔁 0 💬 0 📌 0
Preview: NoLiMa: Long-Context Evaluation Beyond Literal Matching — Recent large language models (LLMs) support long contexts ranging from 128K to 1M tokens. A popular method for evaluating these capabilities is the needle-in-a-haystack (NIAH) test, which involves ret...

Check out the paper & our GitHub repo (with results on recent models 🆕✨)!
📄: arxiv.org/abs/2502.05167
🔗: github.com/adobe-resear...
🤗: huggingface.co/datasets/amo...
This work was my internship project at @adobe.com, in collaboration with my mentors there and Hinrich Schütze.

09.07.2025 13:53 👍 1 🔁 0 💬 1 📌 0
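The core distinction NoLiMa draws can be illustrated with a simple overlap check (the helper and stopword list are invented for this sketch; the Dresden example paraphrases the paper's motivating setup): classic needle-in-a-haystack questions share content words with the needle, so lexical matching alone can locate it, while a NoLiMa-style pair requires an associative hop with no literal overlap.

```python
# Toy check for literal question/needle overlap, the property NoLiMa removes.

STOPWORDS = {"the", "a", "an", "of", "in", "to", "is", "was",
             "what", "which", "who", "has", "been", "actually"}

def content_overlap(question: str, needle: str) -> set[str]:
    """Content words shared between a question and a needle sentence."""
    q = {w.strip("?.,").lower() for w in question.split()} - STOPWORDS
    n = {w.strip("?.,").lower() for w in needle.split()} - STOPWORDS
    return q & n

# Classic NIAH: the question's content words literally appear in the needle.
niah = content_overlap(
    "What is the special magic number?",
    "The special magic number is 7421.",
)

# NoLiMa-style: the link (Semper Opera House -> Dresden) is latent knowledge.
nolima = content_overlap(
    "Which character has been to Dresden?",
    "Actually, Yuki lives next to the Semper Opera House.",
)

assert len(niah) > 0 and len(nolima) == 0
```

Once that lexical shortcut is gone, the model must actually reason over the context, which is where the reported accuracy drops appear.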

I'll be at @icmlconf.bsky.social next week presenting NoLiMa!
Poster on Tue July 15, 4:30–7pm (E-2312).

Happy to grab a coffee and chat about long-context, memory, research, or just to catch up.

I'll be in Toronto for a couple of days after the conference; let me know if you're around!

09.07.2025 13:53 👍 4 🔁 2 💬 1 📌 0

MemLLM: Finetuning LLMs to Use Explicit Read-Write Memory

Ali Modarressi, Abdullatif Köksal, Ayyoob Imani, Mohsen Fayyaz, Hinrich Schütze

Action editor: Greg Durrett

https://openreview.net/forum?id=dghM7sOudh

#memory #memorizing #memllm

19.05.2025 00:07 👍 2 🔁 1 💬 0 📌 0
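The explicit read-write memory idea can be sketched as a tiny triple store (the interface below is invented for illustration and is not MemLLM's actual API): the model emits write calls while reading text and issues read queries at generation time, instead of relying solely on parametric memory.

```python
# Invented illustration of an explicit read-write triple memory.

class TripleMemory:
    def __init__(self) -> None:
        self.triples: set[tuple[str, str, str]] = set()

    def write(self, subj: str, rel: str, obj: str) -> None:
        # Called while the model reads text and extracts a fact.
        self.triples.add((subj, rel, obj))

    def read(self, subj: str, rel: str) -> list[str]:
        # Called at generation time instead of recalling from weights.
        return sorted(o for s, r, o in self.triples if (s, r) == (subj, rel))

mem = TripleMemory()
mem.write("Paris", "capital_of", "France")
mem.write("Warsaw", "capital_of", "Poland")
assert mem.read("Paris", "capital_of") == ["France"]
assert mem.read("Paris", "population") == []
```

The finetuning part of the work teaches the LLM *when* to emit such read/write operations; the store itself can stay this simple.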
Preview: Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence — Dense retrieval models are commonly used in Information Retrieval (IR) applications, such as Retrieval-Augmented Generation (RAG). Since they often serve as the first step in these systems, their robu...

The takeaway? We need robust retrievers that prioritize answer relevance, not just heuristic shortcuts.

work with an amazing team:
@mohsen-fayyaz.bsky.social,
Hinrich Schütze,
@violetpeng.bsky.social

paper: arxiv.org/abs/2503.05037
dataset 🤗: t.co/QZFyCLqP0P

Cross-post from x.com/mohsen_fayyaz

17.05.2025 20:28 👍 3 🔁 0 💬 0 📌 0

We also analyze RAG: biased retrievers can mislead LLMs, degrading their performance by 34%, worse than retrieving nothing! 😮

17.05.2025 20:28 👍 1 🔁 0 💬 1 📌 0

When multiple biases combine, retrievers fail catastrophically:
📉 Answer-containing docs are ranked above a synthetic biased doc with no answer less than 3% of the time!

17.05.2025 20:28 👍 1 🔁 0 💬 1 📌 0

Dense retrievers are crucial for RAG and search, but do they actually retrieve useful evidence? 🤔
We design controlled experiments by repurposing a relation extraction dataset, exposing serious flaws in models like Dragon+ and Contriever.

17.05.2025 20:28 👍 2 🔁 0 💬 1 📌 0

📄 Collapse of Dense Retrievers

Accepted to #ACL2025 main conference 🎉🎉

In this paper we uncover major vulnerabilities in dense retrievers like Contriever, showing they favor:
📌 Shorter docs
📌 Early positions
📌 Repeated entities
📌 Literal matches
...all while ignoring the answer's presence!

17.05.2025 20:28 👍 9 🔁 2 💬 1 📌 1
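The failure pattern above can be mimicked with a deliberately biased toy scorer (a lexical stand-in for demonstration, not a real dense retriever and not the paper's evaluation): rewarding query-term overlap and brevity is enough to rank a literal echo of the query above a document that actually contains the answer.

```python
# Toy scorer exhibiting the shorter-doc and literal-match biases.

def tokens(text: str) -> list[str]:
    return [w.strip("?.,").lower() for w in text.split()]

def biased_score(query: str, doc: str) -> float:
    q, d = set(tokens(query)), tokens(doc)
    overlap = len(q & set(d)) / len(q)  # literal-match bias
    brevity = 1.0 / len(d)              # shorter-doc bias
    return overlap + brevity

query = "Where was Marie Curie born?"
with_answer = ("Marie Curie, the physicist who pioneered radioactivity "
               "research, was born in Warsaw.")
no_answer = "Where was Marie Curie born?"  # literal echo, zero evidence

assert biased_score(query, no_answer) > biased_score(query, with_answer)
```

A retriever that scores this way never checks whether the answer is present, which is exactly the vulnerability the controlled experiments expose.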