The latest Microsoft Research Forum episode is now available on-demand. Explore new ARO, Dion2, Magentic Marketplace, OptiMind, Agent Lightning, and Healthbots. Register to watch: events.microsoft.com/flow/ms/rese...
White line icons against a blue-green gradient background form an architecture flow chart. In the middle of the chart is a three-by-three matrix of circles and lines within a round-edge square. Above the matrix, three icons in a row (an equation, a person using a desktop, and a head with gears) flow by dotted lines to the matrix. To the left of the matrix is an icon representing a stack of files with an arrow pointing to the matrix. To the right of the matrix is a graph with a double headed arrow pointing to the matrix and to itself. Below the matrix is an icon representing a document. A dotted line arrow connects this graph to the matrix, showing the direction flowing from the matrix to the document. To the right of the document icon is an hourglass icon and three list icons with a dotted line connecting the hourglass to the lists.
Vision-language models improve multimodal systems, but can make them slower, costlier, and harder to deploy. Learn how Phi-4-reasoning-vision-15B, a compact and fast multimodal reasoning model, blends strengths of different methods while reducing their limits: msft.it/6014Q5X0u
CORPGEN enables AI agents to manage dozens of interdependent tasks simultaneously in simulated workplace environments. It maintains performance under heavy multitasking, delivering up to 3.5x higher completion rates than leading baselines. msft.it/6015QbHoH
Long-term glass data storage advances, new work on transferable reasoning, multi-turn AI safety testing, multilingual AI design, and evaluating how models actually think. msft.it/6015QksvJ
Three white outline icons on a blue-to-pink gradient background: an image with a copyright badge, an image overlaid with fingerprint-like lines, and an image framed by a cropping grid.
As synthetic media grows, verifying what's real, and the origin of content, matters more than ever. Our latest report explores media integrity and authentication methods, their limits, and practical paths toward trustworthy provenance across images, audio, and video. msft.it/6012QnGgi
A blue-to-green gradient background featuring three white icons: a networked globe on the left, a cloud in the center, and a stacked database on the right.
Project Silica introduces new techniques for encoding data in borosilicate glass, as described in the journal Nature. These advances lower media cost and simplify writing and reading systems while supporting 10,000-year data preservation: msft.it/6017QVklt
Coming March 3 at 9:00 AM PT, our first Microsoft Research Forum episode of the year.
New ARO, Dion2, Magentic Marketplace, OptiMind, Agent Lightning, and Healthbots.
Register to watch: www.microsoft.com/en-us/resear...
We are pleased to announce that Doug Burger, Technical Fellow and Corporate Vice President at Microsoft Research, has been elected to the National Academy of Engineering Class of 2026 for advancing cloud-scale computing and networking with field-programmable systems. msft.it/6011Quo5H
Speech recognition for low-resource languages, medical foundation models, space-based ML, and brain-computer interfaces: msft.it/6016QPDpp
This research looks at why Predictive Inverse Dynamics Models often outperform standard Behavior Cloning in imitation learning. By using simple predictions of what happens next, PIDMs reduce ambiguity and learn from far fewer demonstrations. Learn more: msft.it/6018QMkdO
Three white line icons on a blue to purple gradient background: a vertical audio waveform on the left, a globe showing Africa and Europe in the center, and a network on the right.
Microsoft Research unveils Paza, a human-centered speech pipeline, and PazaBench, the first leaderboard for low-resource languages. It covers 39 African languages and 52 models and is tested with communities in real settings. msft.it/6016QMDHe
Give this a second thought: when language is a barrier to access, AI can help play translator. Here's how we're building technology that reflects culture and lived experience.
Explore recent episodes of 'On Second Thought' here: msft.it/6019QKh0f
Image with a purple gradient background featuring the Association for the Advancement of Artificial Intelligence (AAAI) logo and name on the left. On the right is a tilted preview of an academic paper titled LLM2CLIP: Powerful Language Model Unlocks Richer Cross Modality Representation.
Microsoft researchers received the AAAI-26 Outstanding Paper Award for LLM2CLIP, a vision-language framework that uses large language models as "teachers" to help CLIP better understand long, complex captions and achieve state-of-the-art multimodal performance. msft.it/6017QHtL1
In a new paper published in Communications of the ACM (CACM), Microsoft researchers explore how a "web of agents" could enable an open, decentralized agentic economy, avoiding walled gardens that limit innovation, concentrate power, and reduce user choice. Read more: msft.it/6043QHUD9
Three white icons on a blue green gradient: a ribcage scan, a circuit style document, and a neural network diagram.
AI can help generate medical image reports, but today's models struggle with varying reporting schemes. Learn how UniRG uses reinforcement learning to boost performance of medical vision-language models: msft.it/6018Q1QTn
Abstract graphic featuring concentric arcs and rectangular segments in varying shades of blue on a white background. Two purple text boxes appear: one on the left reading Research Focus and one on the right reading January 26, 2026.
AI models that bring language-driven control to the physical world, new work on billion-scale vector search, spatially consistent video generation, tools for analyzing ML-driven cloud systems, and a CVPR 2026 workshop on multimodal AI agents. msft.it/6018Q1Hn8
Madan Musuvathi has been named an ACM Fellow by the Association for Computing Machinery, recognizing his foundational work in concurrency verification and his impact on modern machine learning systems design. Congratulations! msft.it/6014Q8Wvk
The question isn't whether AI or doctors will shape healthcare. It's how AI and clinicians can work together to deliver better care.
Watch Episode 1 of On Second Thought with @sineadbovell.bsky.social for the full conversation. msft.it/6014Q8mIq
A dual-arm robot with two-finger grippers is unplugging a power adapter from a white power strip on a workbench.
Rho-alpha, which translates natural language commands into control signals for robotic systems doing bimanual manipulation tasks, aims to make physical systems more adaptable by using physical sensing modalities like touch and continuous learning from human feedback. msft.it/6018QBUCn
Argos improves multimodal RL by evaluating whether an agent's reasoning aligns with what it observes over time. The approach reduces visual hallucinations and produces more reliable, data-efficient agents for real-world applications: msft.it/6010QBEgy
"We're really diverse yet medicine has to operate off of averages." Futurist @sineadbovell.bsky.social explores with Microsoft's Jonathan Carlson how AI is helping to unlock a new era of personalized medicine. Watch episode 1, live now: www.youtube.com/watch?v=WKrG...
OptiMind is a small language model that converts business operation challenges, described in natural language, into mathematical formulations that optimization software can solve. It reduces formulation time and errors, and enables fast, privacy-preserving local use: msft.it/6017t7ISB
Abstract graphic featuring concentric arcs and rectangular segments in varying shades of blue on a white background. Two purple text boxes appear: one on the left reading "Research Focus" and one on the right reading "January 12, 2026."
A look at what's next in AI, plus new research in multimodal reasoning, long-horizon robotics, scalable self-supervised learning, GPU optimization with AI, and interpretable LLM reasoning. msft.it/6012tfazi
Abstract circular design with green segments on white background, text reads "Research Focus, December 19, 2025."
2025 saw groundbreaking innovations including AI-powered materials discovery tools, protein structure modeling, and multilingual AI for underserved communities. Dive into our Year in Review for a look at these and other transformative advances. msft.it/6011tUVUJ
Gradient background transitioning from blue at the top to pink at the bottom, with small white dots scattered across the blue area. A white outlined square frame encloses bold white text reading "2025" and smaller text below stating "Microsoft Research Year in Review."
From AI-driven material discovery and protein modeling to agentic systems, gaming, and multilingual AI for the global majority, 2025 turned research into real-world impact. Explore the breakthroughs in Microsoft Research's Year in Review. msft.it/6013tUPR3
Holoportation™ technology, enabling real-time 3D telecommunications, has evolved from the lab to real-world use. After a decade of refinement and real-world deployment, it's been released under an open source license to encourage wider use and development: msft.it/6010toiZQ
The Agentic AI Research and Innovation (AARI) initiative brings Microsoft Research and Academic Researchers together to advance safe, robust agentic systems from foundational science to real-world impact through open, global collaboration. msft.it/6045tWLfv
Microsoft Chief Scientific Officer Eric Horvitz on what stood out at NeurIPS 2025, what's next, and why these moments matter for the future of AI research.
The Microsoft Research Asia StarTrack Scholars Program is a three-month experience designed to foster global collaboration and accelerate frontier research. Applications are due today, December 15. Details on research fields, programs, and application procedures are available at: msft.it/6042tm1oI
ICYMI: Microsoft Chief Product Officer of Responsible AI Sarah Bird talks NeurIPS, the 20th anniversary of WiML, and what's next in RAI.