This call is still open. I am looking to recruit, as are many other faculty at Cornell. We review folders as they come in, and will send offers until all positions are filled.
Please share with your network 🙏
We have a mailing list for big announcements:
groups.google.com/g/colm-annou...
We use it very sparingly, roughly 1-2 times a year
Call for papers -- due March 31, 2026 (abstracts due March 26)
colmweb.org/cfp.html
Call for workshops -- due April 14, 2026
colmweb.org/cfw.html
Hence, this is an interesting and important benchmark. Through a simple environment, it exposes a fairly fundamental flaw in current models
This is not surprising, and aligns with other findings in the literature regarding visual reasoning and manipulation
The prompts do provide rudimentary illustrations. The stateful version allows the model to see the outcome of its own actions, technically letting it infer the physics. Generally, though, the result for LLMs out of the box is negative.
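To make the stateful setup concrete, here is a minimal sketch of the kind of observe-act loop it implies; `env`, `llm`, and the prompt format are placeholders for illustration, not the benchmark's actual interface.

```python
# Minimal sketch of a "stateful" evaluation loop: the model proposes an action,
# the environment applies it, and the resulting observation is appended to the
# context so the model can (in principle) infer the dynamics from its own actions.
# `env` and `llm` are assumed placeholders, not the benchmark's real API.
def run_stateful_episode(env, llm, max_steps=20):
    history = [f"Initial observation: {env.reset()}"]
    for _ in range(max_steps):
        prompt = "\n".join(history) + "\nNext action:"
        action = llm(prompt)                   # model picks an action as text
        obs, done = env.step(action)           # environment returns the outcome
        history.append(f"Action: {action}")
        history.append(f"Observation: {obs}")  # model sees the effect of its action
        if done:
            break
    return history
```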
Most of the experiments are not with VLMs, but with a diverse set of RL methods.
Do LLMs understand physics? They definitely generate outputs that seem to indicate so.
Submit to COLM! Deadline of March 31. This llama gets to enjoy his holidays and isn't stressed out just yet...
Zoe presented this paper at NeurIPS D+B: it's all knots (🪢🪢🪢!?), no language tokens were harmed (or reinforced) in the process
It's such a fun and creative paper, a real mind twist ;)
You really get to think carefully about visual intelligence looking at these knots 🪢
Hi all, I will be at #NeurIPS2025 to present my work on stress-testing looooooong visual reasoning with KnotGym🥨
Let's talk, whether or not your VLM can see 14 million possible futures like Doctor Strange
COLM is going to San Francisco for 2026!
🗓️Dates: October 6-9, 2026
🏨Venue: Hilton San Francisco Union Square
Website and CFPs for papers and workshops coming up soon!
This is maybe counterintuitive to the original intention of just indexing the chaos to make it accessible. I guess that ideal of search softened a long time ago
That's definitely part of it, because this digestion has a deeper history. Search engine indexing also just seems easier, so companies opt for it, even pre-AI-overview-everything
Re peer-rev --> pre-print servers: arXiv is a simple, uniform place to store papers. Indexing engines love it, so if you want something to be searchable, nothing is better. To make things worse, at times it seems like journals/proceedings almost play a game of hide-and-seek with their PDFs
Re position papers: I don't think anyone can deny how effective some of these papers became for citation counts
Is this all just a big practical joke for ChatGPT? I have been told god doesn't play dice with the world, but I guess AGI does :)
It's a Thursday though ....
All available here:
lm-class.org
ChangeLog here:
lm-class.org/CHANGELOG.md
Pushed a big update to LM-class (v2025.2) -- this second version is a much more mature resource
Many refinements of lecture slides + significant improvements to the assignments
Many thanks to @ch272h.bsky.social, Yilun Hua, and Shankar Padmanabhan for their work on the assignments
This kind of ad-hoc adaptation is hard in general for LLMs, but you can post-train for it to some degree
arxiv.org/abs/2508.06482
I suspect contemporary ASR models have the same backbone, so maybe applicable too
More broadly, there is a lot of interesting stuff to do in this space of adaptation
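For a rough sense of what "post-train for it" can look like, here is a generic LoRA fine-tuning sketch on a small causal LM; the model name, data, and hyperparameters are assumed placeholders, not the recipe from the linked paper.

```python
# Generic post-training sketch: LoRA fine-tuning of a small causal LM on
# examples demonstrating the desired adaptation behavior.
# Placeholder model/data; not the method of the linked paper.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["c_attn"],
                                         task_type="CAUSAL_LM"))

# Hypothetical adaptation examples: context demonstrating the new behavior,
# followed by the desired continuation.
examples = ["<context showing the adaptation> <desired continuation>"]
optimizer = AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(3):
    for text in examples:
        batch = tokenizer(text, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss  # standard LM loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```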
I am potentially recruiting a postdoctoral fellow through this program. If interested, name me as a mentor, and ping me to let me know that you are applying! The process includes some sort of interview, so I can try to squeeze a few of these in advance (it will help a lot!)
Cornell is recruiting for multiple postdoctoral positions in AI as part of two programs: Empire AI Fellows and Foundational AI Fellows. Positions are available in NYC and Ithaca.
Deadline for full consideration is Nov 20, 2025!
academicjobsonline.org/ajo/jobs/30971
Cornell (NYC and Ithaca) is recruiting AI postdocs, apply by Nov 20, 2025! If you're interested in working with me on technical approaches to responsible AI (e.g., personalization, fairness), please email me.
academicjobsonline.org/ajo/jobs/30971
Wild
There's the legit gaming, which is just optimizing for the metrics and breaking them. Then there's the really fake stuff, like citation rings. You would think citations translate to bitcoins, given the level of creativity and effort people put into this
The top citer has >1k papers, with a PhD from 2007. That's one hell of a steady rate ¯\_(ツ)_/¯
It's pretty crazy how the entire citation game has been manipulated. It's enough to take a quick look at Semantic Scholar for Bengio, whom GScholar just credited with 1M citations. SScholar gives 0.5M, but it's not only the number, it's the top citers
Recent IVADO talk is now on YouTube:
www.youtube.com/watch?v=ozHk...
Paper here:
Pre-training Limited Memory Language Models with Internal and External Knowledge
Linxi Zhao et al.
arxiv.org/abs/2505.15962
It definitely doesn't seem to hold for the process, which lacks any similar regulation or structure. The (sci-fi-ish?) argument is that one cannot disentangle deployment/impact from development (i.e., one cannot shut it down).