Overall, these spot talks were a gem. There don't seem to be recordings, so I hope the slides get released. Already looking forward to next year! #NeurIPS2025 #NeurIPSanDiego
Dec 4: Alex Smola's "Boson.AI Talk to me - Engineering Conversational Intelligence" was a great presentation. The talk covers data collection, model design, and alignment for high-quality voice AI, which are key ingredients for training models that sound realistic. Try the demo: www.boson.ai/demo/shop
Dec 3: Shixiong Zhang and Genta Winata introduced Capital One's T1 dataset. This is a tool-augmented, multi-domain, multi-turn conversational dataset designed for agent planning. Loved seeing how T1-Agent handles complex, dependency-heavy workflows. Paper link: arxiv.org/abs/2505.16986
Dec 3: Zhengzhong Liu and Jiannan Xiang shared "Towards a Blueprint for Open Science of Foundation Models." The talk presents an interactive, long-horizon world model that predicts future states via high-quality video simulation. Work done at IFM@MBZUAI. Read the paper: arxiv.org/abs/2511.09057
Dec 2: David Cox's talk, "From Agent Soup to Proper Software Design," offered a refreshing take on building reliable LLM systems. Loved the pitch behind Mellea, a generative AI library that gives developers more control through software design principles. YouTube link: www.youtube.com/watch?v=j2ou...
Really enjoyed the Exhibitor Spot Talks at #NeurIPS this year! These are 12-minute short talks packed with interesting ideas. Some of the most fun talks I attended are:
That makes it difficult to compare systems across domains, or figure out which one's best for a new planning problem.
That's where our paper comes in: We offer a comprehensive overview of LLM planning agents, highlighting gaps, challenges, and what's next.
Check it out: arxiv.org/abs/2502.11221
Planning is a core aspect of both human and artificial intelligence.
LLMs/agents have been used in various planning tasks, from navigating websites and planning trips to querying databases, but most benchmarks are narrow and task-specific.
Thrilled that our paper #PlanGenLLMs (arxiv.org/abs/2502.11221) won the SAC Award at #ACL2025!!
Couldn't have done it without the amazing team: Hui Wei, Zihao Zhang, Shenghua He, Tian Xia, and Shijia Pan. So thankful and beyond proud! #ACL2025NLP #NLProc
Happy to share our paper got selected as an Oral Presentation at #ACL2025!
Out of 8,000+ submissions and 3,000+ accepted papers, only 245 were chosen for oral (<3%)!
Paper: arxiv.org/abs/2502.11221
Resource: github.com/wll199566/Aw...
Autonomous agents are powerful, but without guardrails, they drift into inefficiency.
We view 'cost' as a form of guardrail and use Monte Carlo Tree Search with explicit cost-awareness to guide LLM-based planning.
Link: arxiv.org/pdf/2505.14656
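For anyone curious what "cost as a guardrail" could look like in code, here is a minimal sketch of Monte Carlo Tree Search with a cost penalty folded into the reward. It is not the paper's implementation: the toy domain, the action names (`cheap_tool`, `pricey_tool`), the `COST_WEIGHT` penalty, and the random rollouts standing in for an LLM policy are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy domain: action name -> (progress gained, cost incurred).
ACTIONS = {"cheap_tool": (1, 1.0), "pricey_tool": (3, 5.0)}
GOAL = 4            # progress needed for the task to count as solved
MAX_DEPTH = 6       # hard cap on plan length
COST_WEIGHT = 0.05  # assumed trade-off between success and accumulated cost


class Node:
    def __init__(self, progress=0, cost=0.0, depth=0, parent=None):
        self.progress, self.cost, self.depth = progress, cost, depth
        self.parent = parent
        self.children = {}  # action name -> child Node
        self.visits = 0
        self.total = 0.0    # sum of backed-up scores

    def terminal(self):
        return self.progress >= GOAL or self.depth >= MAX_DEPTH


def step(node, action):
    """Apply an action and return the resulting (unattached) node."""
    gain, cost = ACTIONS[action]
    return Node(node.progress + gain, node.cost + cost, node.depth + 1, node)


def score(node):
    """Task reward minus a cost penalty: the cost-aware part of the search."""
    success = 1.0 if node.progress >= GOAL else 0.0
    return success - COST_WEIGHT * node.cost


def select_child(node, c=1.4):
    """Standard UCT: mean backed-up score plus an exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(node, rng):
    """Random playout; a real system might query an LLM policy here instead."""
    while not node.terminal():
        node = step(node, rng.choice(list(ACTIONS)))
    return score(node)


def search(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node()
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.terminal() and len(node.children) == len(ACTIONS):
            node = select_child(node)
        # Expansion: attach one untried action.
        if not node.terminal():
            action = rng.choice([a for a in ACTIONS if a not in node.children])
            node.children[action] = step(node, action)
            node = node.children[action]
        # Simulation, then backpropagation of the cost-penalized score.
        value = rollout(node, rng)
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    # Read out a plan by following the most-visited child at each level.
    plan, node = [], root
    while node.children and not node.terminal():
        action, node = max(node.children.items(), key=lambda kv: kv[1].visits)
        plan.append(action)
    return plan, node.cost
```

Calling `plan, cost = search()` returns a list of action names and the total cost of that plan. Raising `COST_WEIGHT` in this sketch makes expensive plans score lower, which is the guardrail idea in its simplest form.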