Shane Storks

@shanestorks

NLP researcher and postdoc at University of Michigan Weinberg Institute for Cognitive Science. Grounded language understanding and reasoning. (he/him) 🏳️‍🌈

80
Followers
197
Following
9
Posts
20.11.2024
Joined

Latest posts by Shane Storks @shanestorks

Hello #NLProc #ACL2026NLP people. I am looking for **two emergency reviewers** in the Safety and Alignment in LLMs track for ACL/ARR.

Reviews are due Feb 15th. Please DM if interested and available.

Happy to offer drinks/food if you live in/pass by Lisbon ☀️

10.02.2026 14:59 👍 6 🔁 10 💬 0 📌 0

Seems to be a common situation for ACs this round, but I'm also looking for two emergency reviewers for the January #ARR Evaluation and Resources track. I'd appreciate any help (reposts, encouragement, black magic...)

10.02.2026 11:15 👍 3 🔁 6 💬 0 📌 0

I'm looking for two emergency reviewers 🧑‍🚒👩‍🚒 for the ARR January Generalizability and Transfer track.

Please reach out if you have time & qualify for review or RT for visibility 🙏🙏

10.02.2026 11:43 👍 2 🔁 6 💬 0 📌 0

I could use an emergency reviewer for an ACL submission involving interpretability and syntax. Please DM me if you might be able to provide an emergency review before February 15!

10.02.2026 07:38 👍 4 🔁 4 💬 1 📌 0

Looking for emergency reviewers for ARR Special Track "Explainability of NLP Models". Topics: Faithfulness, mechanistic interpretability, surveys and position papers. Deadline Feb 14 AoE. #ACL2026NLP

09.02.2026 17:33 👍 8 🔁 7 💬 1 📌 1

I am looking for 2 emergency reviewers for the ARR Ethics, Bias & Fairness track. Please DM me if you are available 🙏

10.02.2026 09:27 👍 6 🔁 6 💬 0 📌 0

Hello #NLProc #ACL2026NLP community, I'm looking for an emergency reviewer for an ARR submission on LLM interpretability.

If you're available to complete a review before Feb 15, please reply or DM 🙏

10.02.2026 14:41 👍 2 🔁 6 💬 0 📌 0

This work finally has a home! Looking forward to presenting “Transparent and Coherent Procedural Mistake Detection” at #EMNLP2025 🤩

20.08.2025 22:41 👍 0 🔁 0 💬 0 📌 0
Screenshot of the Ai2 Paper Finder interface

Meet Ai2 Paper Finder, an LLM-powered literature search system.

Searching for relevant work is a multi-step process that requires iteration. Paper Finder mimics this workflow and helps researchers find more papers than ever 🔍

26.03.2025 19:07 👍 117 🔁 23 💬 6 📌 9
Call for Main Conference Papers Official website for the 2025 Conference on Empirical Methods in Natural Language Processing

The EMNLP 2025 conference website and CfP are now live! 2025.emnlp.org/calls/main-c...

Conference dates: November 5-9 in Suzhou, China

Submissions will be through ARR, and this year's theme is Interdisciplinary Recontextualization of NLP

14.02.2025 16:25 👍 25 🔁 7 💬 0 📌 2

The submission deadline for our workshop has been extended to Feb 20. We look forward to your papers at NAACL's Queer in AI workshop.

03.02.2025 14:10 👍 17 🔁 18 💬 0 📌 1

One of the ways in which AI hype men are highly copacetic with Trump is that they think you can assert things with absolutely no care for truth or feasibility. Bullshitters par excellence

24.01.2025 15:13 👍 50 🔁 7 💬 3 📌 0
Coherent Physical Commonsense Reasoning in Foundational Language Models

Some happy news: my dissertation on "Coherent Physical Commonsense Reasoning in Foundational Language Models" is finally available online! 🎓 https://deepblue.lib.umich.edu/handle/2027.42/196025

21.01.2025 20:24 👍 3 🔁 0 💬 0 📌 0

Adding more details. Space is (very) limited. Please contact me by next Wednesday 1/15/2025 for full consideration. Proposal doesn’t have to be formal.

10.01.2025 13:23 👍 0 🔁 0 💬 0 📌 0

📣 UMich undergraduate/master's students: are you interested in research at the intersection of LLMs and cognitive science, but need guidance and computing resources? I want to work with you!

If interested, DM/email me with your CV and a brief project proposal!

08.01.2025 19:34 👍 4 🔁 2 💬 1 📌 0
Shane Storks wearing academic regalia after his doctoral hooding ceremony.

So happy to finally share this last piece of my dissertation (and my first post on Bluesky)!

Obligatory photo after my recent hooding attached 🧑‍🎓

17.12.2024 13:59 👍 1 🔁 0 💬 0 📌 0

Compared to vanilla VLMs, our interventions improve the accuracy of mistake detection and the relevance, coherence, and efficiency of explanations.

We also show that patterns in metrics can indicate common issues in VLMs, such as visual hallucination! 😵‍💫

17.12.2024 13:59 👍 0 🔁 0 💬 1 📌 0

In this work, we expand the recently studied problem of procedural mistake detection in images to require explanations through self-Q&A. 👁‍🗨🤖💬

We define automated metrics for explanation coherence, and incorporate them into VLMs with various inference and fine-tuning methods.

17.12.2024 13:59 👍 0 🔁 0 💬 1 📌 0
Dialog between a foundational VLM and itself to detect the incomplete state of the procedure "Unclip the pegs on the cloth" in an image showing a cloth pegged to a clothing line. The VLM generates the following questions and answers: 1. "Is there a cloth in the image? Yes", 2. "Are there pegs on the cloth? Yes", and 3. "Is there someone holding pegs? No". As the VLM asks these questions it becomes more confident that the procedure has not been successfully completed.

How well can VLMs detect and explain humans' procedural mistakes, like in cooking or assembly?
🧑‍🍳🧑‍🔧

My new pre-print with Itamar Bar-Yossef, Yayuan Li, Zheyuan Zhang, Jason J. Corso, and Joyce Chai dives into this!

arxiv.org/pdf/2412.11927

17.12.2024 13:59 👍 3 🔁 0 💬 1 📌 1