92 is wild this early. Here in the last week we've had 71 F, tornadoes, baseball-sized hail, then snow, and maybe 9 more inches of snow this weekend? What is this??
Thank you Geoff! We should touch base sometime. I wonder if there are connections to be made between our model publishing efforts, TorchSim GPU MLIP software, unique datasets (e.g., for OMol25 we have millions of wave functions on hand), and more. Cheers!
Almost certainly every paper will be touched by AI/ML tools and the associated software stack in the not too distant future. That's just the way these technologies diffuse.
If you'd like to reuse or remix the figures for talks, papers, or slides, everything is available here: github.com/blaiszik/ml_...
To make the data easier to reuse, the repository includes:
+ Raw data queries, and data in csv format
+ Notebooks used to generate the plots
+ Both normalized and raw publication charts
This is quite a long-term trend as well. Across these fields, 5-year compound growth rates remain extremely high (Materials: 37.3%, Chemistry: 28.3%, Physics: 27.9%), suggesting that the integration of machine learning into core scientific workflows is still accelerating.
Updated AI/ML publication charts to include data from 2025! After what looked like an approaching S-curve top in 2024, the 2025 growth looks more bullish.
In 2025, Web of Science data shows:
🔸 Materials: 12,987 papers, +36.9% YoY
🔹 Chemistry: 16,522 papers, +40.8% YoY
🔸 Physics: 11,955 papers, +27.5% YoY
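For anyone who wants to check or extend the growth figures in this thread, here is a minimal Python sketch of the two calculations involved: year-over-year growth and compound annual growth rate. The function names are illustrative and not taken from the repository notebooks. The prior-year counts it prints are only what the quoted 2025 counts and percentages imply, so treat them as approximate rather than as reported data.

```python
def yoy_growth(previous: float, current: float) -> float:
    """Year-over-year growth as a fraction (0.369 means +36.9%)."""
    return current / previous - 1.0


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate over `years` yearly steps."""
    return (end / start) ** (1.0 / years) - 1.0


def implied_previous(current: float, growth: float) -> float:
    """Back out the prior-year count implied by a quoted growth rate."""
    return current / (1.0 + growth)


# 2025 paper counts and YoY growth exactly as quoted in the post.
quoted_2025 = {
    "Materials": (12_987, 0.369),
    "Chemistry": (16_522, 0.408),
    "Physics": (11_955, 0.275),
}

for field, (count, growth) in quoted_2025.items():
    # Approximate: the quoted percentages are rounded to one decimal place.
    previous = implied_previous(count, growth)
    print(f"{field}: implied prior-year count ~{previous:,.0f}")
```

Assuming "5-year" means five yearly steps, a 37.3% compound rate multiplies the annual paper count by roughly 4.9 over that window.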
Over the next month, we'll be releasing major updates to the Materials Data Facility! We will maintain support for all previous features, but there will be a lot of new ones. What are critical aspects you would like to see in a modern data repository?
If you're interested in testing, send me a DM!
Cloudflare rebuilt Next.js in a week with AI for $1,100. I wrote about how this could become a defining moment for the scientific community to erase decades of tech debt and build the software tools that we have imagined, to accelerate progress. Let's go!
www.linkedin.com/pulse/what-w...
I was looking at my dissertation defense slides recently and was fondly like... "Wow, this is what it's like to focus on one project a year."
The second panel will feature researchers from industry, including Santiago Miret (Lila), Abhijeet Gangan (Periodic), Lily Kim (Microsoft Discovery), and Tim Erdmann (IBM).
More details on the panel here, with registration: www.marda-alliance.org/2026-marda-a...
The Materials Research Data Alliance meeting is next week! Great panelists and opportunity for connection. Registration is free, and content is available online worldwide.
Sign up to hear from key speakers: James Warren (NIST), John Schlueter (NSF), Andrew Schwartz (DOE), and Sean Donegan (AFRL)
If your team works on ML for materials/molecular discovery and needs someone who has built physics-informed networks for real problems, check out Zhihao Feng. Genuine problem-solver, strong collaborator, adaptable. Here's an example of his work.
Link: www.linkedin.com/posts/fengzh...
What are the biggest software-solvable problems holding you back in your research?
e.g.,
- There's no Python package to convert formats between X and Y
- Running tasks on HPC systems requires expertise I don't have
- Formatting or tracking the references for my paper is annoying
As a kid, I was outdoors and in the woods all the time. The pandemic reminded me how awesome outside is. I still do as many meetings as possible on walks.
or you'll just publish...much more.
"If I had a million dollars
We wouldn't have to eat Kraft Dinner
But we would eat Kraft Dinner
Of course we would, we'd just eat more"
The 29th is a typo for the final day; it is in fact the 19th :)
There will also be opportunities to connect with others interested in these topics through poster sessions and a community Slack.
The event is online, so available around the world. Full speaker list soon!
The 4th Materials Research Data Alliance meeting is set for Feb 17-19 on topics in materials + data + AI. This is a great opportunity to hear from speakers from academia, national labs, and industry.
Register now (free - online): marda-alliance.org/2026-marda-a...
I'm excited to help improve our understanding of catalysis and fusion materials via the Integrated Scientific Agentic AI for Catalysis (ISAAC) and CascAIde projects, and to see Rick Stevens and @ianfoster leading ModCon to develop/collect the models/data needed to advance science. Stay tuned!
DOE announces $320M in new investments in AI4Science!
You can read more about the Genesis Mission, the American Science Cloud, the Transformational AI Models Consortium (ModCon), along with 51 AI in science application projects here and in the attached image.
energy.gov/articles/ene...
Amazing week of results from AI4Science with NeurIPS and MRS. But for those who couldn't make it, post a link to your research here so we can boost you too!
I saved this image from Stable Diffusion in Dec 2022 because it was amazing that a model could output such detail.
Full details and package here: github.com/TorchSim/tor...
Paper: arxiv.org/abs/2508.06628
Thanks to the 10 contributors! Thomas Loux, Curtis Chong, Rhys Goodall, Orion Archer Cohen, Will Engler, Abhijeet Gangan, Andrew_S_Rosen and others
Bonus, a fun NanoBanana interpretation of the TorchSim paper
New features include mixed periodic boundary condition (PBC) support, the NVT Nose-Hoover method, and more. This release also enables important integrations with QUACC and Atomate2 to proceed. More on that soon!
TorchSim continues its growth aiming to be the high-performance engine for MLIP-powered atomistic simulation.
With this release (4.1), there are improvements across the entire stack with new features, bug fixes, and improved documentation.
So your data are available upon reasonable request? Well, we are making some reasonable requests - at scale. :)
1. Search literature (currently stubbed)
2. Enumerate papers, extract contacts
3. Send email w/ data drop location
4. Parse data
Does anyone want to help productionize this?
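For anyone thinking about picking this up, the four steps above could be sketched roughly as follows. This is a hypothetical outline using only the Python standard library, not the actual implementation: every function name, the `Paper` record, the example address, and the drop URL are assumptions made for illustration. The search step is stubbed (as in the post) and the email is composed but never sent.

```python
import csv
import io
import re
from dataclasses import dataclass
from email.message import EmailMessage

# Simple pattern for illustration; real contact extraction needs more care.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass
class Paper:
    title: str
    text: str  # e.g. author block plus data availability statement


def search_literature(query: str) -> list[Paper]:
    """Step 1 (stubbed): return a fixed, made-up record."""
    return [Paper(
        title="Example paper",
        text="Data are available upon reasonable request: author@example.org",
    )]


def extract_contacts(paper: Paper) -> list[str]:
    """Step 2: pull unique email addresses out of the paper text."""
    return sorted(set(EMAIL_RE.findall(paper.text)))


def build_request(to_addr: str, paper: Paper, drop_url: str) -> EmailMessage:
    """Step 3: compose (but do not send) a request with a data drop location."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = f"Data request: {paper.title}"
    msg.set_content(f"Could you share the data for this paper at {drop_url}?")
    return msg


def parse_data(raw_csv: str) -> list[dict[str, str]]:
    """Step 4: parse a returned CSV payload into rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


if __name__ == "__main__":
    for paper in search_literature("machine learning interatomic potentials"):
        for contact in extract_contacts(paper):
            print(build_request(contact, paper, "https://example.org/drop"))
```

Productionizing would mostly mean replacing the stub with a real literature search, adding rate limiting and opt-out handling for outgoing mail, and supporting more formats than CSV in the parsing step.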