Check out Yu Zhao's (@yuzhaouoe.bsky.social) latest work, “Learning GUI Grounding with Spatial Reasoning from Visual Feedback” (www.arxiv.org/abs/2509.21552), done during his internship at MSR (@msftresearch.bsky.social)!
New SOTA 🏆 results on ScreenSpot-v2 (+5.7%) and ScreenSpot-Pro (+110.8%)!
17.10.2025 09:23
💡 We compare prompting (zero-shot and multi-shot + explanations) and inference-time interventions (ActAdd, ReFT and SAEs).
Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual info to personalize literary MT by tuning latent features 4/
23.05.2025 12:23
MMLU-Redux Poster at NAACL 2025
MMLU-Redux just touched down at #NAACL2025! 🎉
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope 😅
If anyone's swinging by, give our research some love! Hit me up if you check it out! 👋
02.05.2025 13:00
We find that a single biased direction encodes a KV Cache selection mechanism in Self-Attention: a Key vector with a strong component in this direction causes its Key-Value pair to be ignored by the Query 🚀🚀🚀
06.03.2025 16:34
New and very cool library! 👏 Our L2 Norm-based KV Cache compression is already implemented and ready to use! 🚀
Check out the method details in our EMNLP '24 paper: arxiv.org/abs/2406.11430
20.11.2024 09:57
I’ll be travelling to London from Wednesday to Friday for an upcoming event and would be very happy to meet up! 🚀
I'd love to chat about my recent work (DeCoRe, MMLU-Redux, etc.). DM me if you’re around! 👋
DeCoRe: arxiv.org/abs/2410.18860
MMLU-Redux: arxiv.org/abs/2406.04127
18.11.2024 13:48