Zining Zhu (@zhuzining)

LLMs are now widely used for reasoning. On regular tasks (more specifically, those covered by the model's training data), that's fine. However, using the models with new information, new rules, or new capabilities requires more caution. 5/n, n=5

06.02.2025 15:12 👍 0 🔁 0 💬 0 📌 0

Can LLMs reason? When the problems are grounded in the real world, the performance is good. Otherwise, there's a huge performance drop. 4/n

06.02.2025 15:12 👍 0 🔁 0 💬 1 📌 0

These are common properties of formal reasoning datasets, but they have been very hard to incorporate into commonsense reasoning (which is usually considered a type of informal reasoning). 3/n

06.02.2025 15:12 👍 0 🔁 0 💬 1 📌 0

ACCORD allows (1) controllable reasoning path length, (2) controllable distraction items on the reasoning tree. These controls are (3) automatic and (4) scalable. 2/n

06.02.2025 15:12 👍 0 🔁 0 💬 1 📌 0

$\texttt{ACCORD}$: Closing the Commonsense Measurability Gap
We present $\texttt{ACCORD}$, a framework and benchmark suite for disentangling the commonsense grounding and reasoning abilities of large language models (LLMs) through controlled, multi-hop counterf...

Let's bring more formal reasoning properties into commonsense reasoning datasets! Introducing ACCORD (arxiv.org/abs/2406.02804), to be presented at #NAACL2025, w/ François Roewer-Després, Jinyue Feng and Frank Rudzicz. 1/n

06.02.2025 15:12 👍 1 🔁 0 💬 1 📌 0

A uniquely interesting book with a lot of new information, and I feel the urge to take notes (either to echo or to debate) while reading. Highly recommend.

22.12.2024 04:57 👍 2 🔁 1 💬 0 📌 0

Behind the graduate mental health crisis in science - Nature Biotechnology
Survey results identify how scientific research and teaching contribute to the graduate student mental health crisis.

Nature Biotechnology

Behind the graduate mental health crisis in science
www.nature.com/articles/s41...

28.11.2024 13:02 👍 4 🔁 4 💬 1 📌 0

I know there are already plenty of tips out there on how to write an effective rebuttal, but I thought I’d share mine as well. I’m not claiming to be an expert or to have a perfect success rate, but I hope these suggestions might be helpful for anyone who could use them.

27.11.2024 04:30 👍 16 🔁 2 💬 1 📌 0

What are some recent papers that show making models explainable can also make them safer?

19.11.2024 22:08 👍 1 🔁 0 💬 0 📌 0

Hi I'm starting to use Bluesky!

19.11.2024 21:59 👍 4 🔁 0 💬 0 📌 0
