NDIF Team (@ndif-team) — bluesky.baby

NNsight is part of a growing open-source ecosystem. We're building the infrastructure so you can focus on the science.

Upgrade to NNsight 0.6 today: pip install nnsight --upgrade

nnsight.net
github.com/ndif-team/nnsight
discuss.ndif.us

27.02.2026 17:14 👍 3 🔁 0 💬 0 📌 0
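
After running the upgrade command in the post above, it can help to confirm which version actually got installed. The following is a minimal sketch that uses only the Python standard library; the helper names (`parse_release`, `meets_minimum`, `installed_nnsight_version`) are illustrative and not part of NNsight, and the simple parser treats a pre-release such as "0.6.0rc1" as 0.6.

```python
from __future__ import annotations

from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Turn '0.6.0' into (0, 6, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(version: str, minimum: tuple[int, ...] = (0, 6)) -> bool:
    """True when the parsed version is at least `minimum`."""
    return parse_release(version) >= minimum


def installed_nnsight_version() -> str | None:
    """Return the installed nnsight version string, or None if it is absent."""
    try:
        return metadata.version("nnsight")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_nnsight_version()
    if found is None:
        print("nnsight is not installed")
    elif meets_minimum(found):
        print(f"nnsight {found} is 0.6 or newer")
    else:
        print(f"nnsight {found} predates 0.6; run: pip install nnsight --upgrade")
```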

Introducing NNsight 0.6 - nnsight
Documentation for the nnsight Python library

Read our blog post to learn more about the design of this release and its features: nnsight.net/blog/2026/02/26/introducing-nnsight-06/

27.02.2026 17:14 👍 2 🔁 0 💬 1 📌 0

GitHub - ndif-team/skills: Teach LLMs to use NNSight with Skills
Contribute to ndif-team/skills development by creating an account on GitHub.

We also ship first-class support for AI coding agents, including skills for Claude Code and Codex, a Context7 MCP server for live docs, and comprehensive guides in the repo.

github.com/ndif-team/skills

27.02.2026 17:14 👍 4 🔁 0 💬 2 📌 0

Other additions include:
- Clean error tracebacks that point to YOUR code, not NNsight internals
- Check NDIF before submitting jobs with ndif.status()
- Standard for step in tracer.iter[:] generation loops (faster than with blocks!)

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

0.6 also comes with 2.4–3.9x performance improvements.

- Empty trace: 1196μs → 308μs
- 12 .save() calls: 1697μs → 716μs

The big wins: always-on trace caching, persistent pymount, and batched variable sync. Setup cost dropped from ~1,100μs to ~210μs.

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0
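
The headline 2.4–3.9x range follows directly from the two benchmark timings quoted in the post above. As a small worked check (a sketch that only redoes the arithmetic on those numbers and is not an NNsight benchmark), the ratios can be computed like this; the setup-cost figures are the post's approximate values:

```python
# Timings in microseconds, copied from the post above as (before, after).
timings_us = {
    "empty trace": (1196, 308),
    "12 .save() calls": (1697, 716),
    "setup cost (approximate)": (1100, 210),
}


def speedup(before: float, after: float) -> float:
    """Ratio of old time to new time; values above 1 mean faster."""
    return before / after


speedups = {name: speedup(b, a) for name, (b, a) in timings_us.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

The first two ratios round to 3.9x and 2.4x, which are the ends of the quoted range, and the setup cost alone improves by roughly 5.2x.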

NNsight 0.6 also introduces first-class support for VisionLanguageModel (e.g., LLaVA, Qwen-VL) and DiffusionModel (e.g., Stable Diffusion, Flux)! Available remotely on NDIF soon 👀

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

vLLM integration got a major upgrade, now supporting single-GPU, multi-GPU tensor parallelism, Ray distributed execution, and even multi-node experiments, all using the same tracing API. NNsight handles the tensor gather/scatter, allowing you to intervene on unsharded tensors.

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

Also new is async mode with real-time token streaming. Build interactive apps like chat interfaces or live visualizations, with interventions running on every forward pass.

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

This is huge for the ecosystem. Libraries like NNterp can ship new features without waiting for NDIF to update. You always run whatever version you have locally.

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

Our #1 request: "I want to run my own analysis code on NDIF, not just inline interventions." Now, with NNsight source code serialization, you can! Your local packages work, even if NDIF doesn't have them installed.

27.02.2026 17:14 👍 3 🔁 0 💬 1 📌 0

NNsight 0.6 is out now! We directly address your feedback in our biggest release yet. Pain points included cryptic errors, slow traces, no remote execution of custom code, and limited vLLM support. We tackle all of these and more in this new release.

🧵 Here's what changed:

27.02.2026 17:14 👍 6 🔁 1 💬 1 📌 1

Auditing language models for hidden objectives
We study the feasibility of conducting alignment audits: investigations into whether models have undesired objectives. As a testbed, we train a language model with a hidden objective. Our training...

For more details, read the paper: arxiv.org/abs/2503.10965

10.02.2026 18:45 👍 3 🔁 0 💬 0 📌 0

Red teams trained a model with a secret objective by exploiting RLHF reward models. Blue teams then audited the model, using techniques such as interpretability with sparse autoencoders, behavioral attacks, and training data analysis to successfully uncover the hidden objective.

10.02.2026 18:45 👍 2 🔁 0 💬 1 📌 0

Auditing Language Models for Hidden Objectives with Sam Marks
Sam Marks leads Anthropic's Cognitive Oversight team, a subteam of Alignment Science. Sam's research focuses on settings where understanding something about ...

Watch Sam Marks present his work "Auditing Language Models for Hidden Objectives" in our new YouTube video! Sam's team ran a blind auditing game to assess the efficacy of black-box and white-box techniques for LLM alignment auditing.

🔗 youtu.be/jZiOJTHqB6M

10.02.2026 18:45 👍 3 🔁 0 💬 1 📌 0

Big thanks to the whole organizing team, especially @neelnanda.bsky.social and Andy Arditi, for hosting such a great workshop and inviting us to speak!

04.02.2026 19:38 👍 2 🔁 0 💬 0 📌 0

NSF National Deep Inference Fabric
NDIF is a research computing project that enables researchers and students to crack open the mysteries inside large-scale AI systems.

Adam Belfki discusses NDIF and Workbench (youtu.be/zmHyaHiw8XU)

- workbench.ndif.us/
- ndif.us/
- nnsight.net/

04.02.2026 19:38 👍 2 🔁 0 💬 1 📌 0

The Dual-Route Model of Induction
Do LLMs copy meaningful text by rote or by understanding meaning? Webpage for The Dual-Route Model of Induction (Feucht et al., 2025).

@sfeucht.bsky.social presents their work on concept induction heads (youtu.be/Jc-sTXW31W0)

- dualroute.baulab.info/

04.02.2026 19:38 👍 2 🔁 0 💬 1 📌 0

"In Defense of Curiosity" with David Bau (NeurIPS 2025 Mech Interp Workshop)
David Bau presents his thoughts on "Pragmatic Interpretability" by recounting the history of Venetian glassmakers at the NeurIPS 2025 Mechanistic Interpretab...

@davidbau.bsky.social shares his thoughts on pragmatic interpretability (youtu.be/iMIsg32mVHM)

- davidbau.com/archives/20...
- In response to: www.alignmentforum.org/posts/StENz...

04.02.2026 19:38 👍 2 🔁 0 💬 1 📌 0

NeurIPS 2025 Mechanistic Interpretability Workshop
Talks from NDIF and Bau Lab at the NeurIPS 2025 Mech Interp Workshop

Check out the NDIF & Bau Lab lightning talks at the NeurIPS 2025 Mechanistic Interpretability Workshop (mechinterpworkshop.com/): youtube.com/playlist?li...

04.02.2026 19:38 👍 4 🔁 1 💬 1 📌 0

Annus Mirabilis: A Year of Explosive Progress in LLMs with Benjamin Feuer
YouTube video by NDIF Team

New YouTube video posted! @benjaminfeuer.bsky.social discusses LLMs' annus mirabilis, presenting his work on open questions surrounding LLM judges, benchmark trustworthiness, and maximizing the potential of synthetic data.

Watch here: www.youtube.com/watch?v=pehc...

28.01.2026 00:43 👍 2 🔁 0 💬 0 📌 0

🔥I am super excited for the official release of an open-source library we've been working on for about a year!

🪄interpreto is an interpretability toolbox for HF language models🤗. In both generation and classification!

Why do you need it, and for what?

1/8 (links at the end)

20.01.2026 16:03 👍 20 🔁 9 💬 1 📌 3

Deepti Ghadiyaram
I am an Assistant Professor at Boston University in the Department of Computer Science. I am also an Affiliated Faculty with the Department of Electrical and Computer Engineering and Faculty of Computing & Data Sciences and an academic collaborator with Runway. My research interests...

📄 Paper: arxiv.org/abs/2411.16725

💻 Code & Visualizations: github.com/revelio-dif......

🌐 Deepti's Website: deeptigp.github.io/

15.01.2026 21:20 👍 2 🔁 0 💬 0 📌 0

Interpreting and Leveraging Diffusion Representations with Deepti Ghadiyaram
Deepti Ghadiyaram is an Assistant Professor at Boston University in the Department of Computer Science, with affiliated appointments in Electrical and Comput...

New year, new YouTube videos! We are resuming our regular interpretability seminar posts, with a fantastic talk by Deepti Ghadiyaram on interpreting diffusion models.

Watch the video: youtu.be/4eqvABPX5rA

15.01.2026 21:20 👍 4 🔁 2 💬 1 📌 0

So excited to have you on the team, @gsarti.com!

09.01.2026 21:39 👍 4 🔁 0 💬 0 📌 0

GitHub - ndif-team/nnterp: Unified access to Large Language Model modules using NNsight

Report issues or contribute to the open-source project: github.com/ndif-team/n...

09.01.2026 21:38 👍 1 🔁 0 💬 0 📌 0

Add support for new models (or custom ones): ndif-team.github.io/nnterp/addi...

09.01.2026 21:38 👍 0 🔁 0 💬 1 📌 0

Try out built-in interventions like logit lens and patchscope: ndif-team.github.io/nnterp/inte...

09.01.2026 21:38 👍 0 🔁 0 💬 1 📌 0

nnterp by @butanium.bsky.social is now part of the NDIF ecosystem! nnterp standardizes transformer naming conventions, includes built-in best practices for common interventions, and is perfectly compatible with original HF model implementations.

Learn more: ndif-team.github.io/nnterp/

09.01.2026 21:38 👍 4 🔁 1 💬 1 📌 1

NNsight 0.5.13 Release: vLLM integration and performance improvements
Excited to announce our new NNsight version, nnsight v0.5.13! This release re-integrates vLLM support into NNsight and introduces performance improvements. To learn more, check out the release notes below and the vLLM tutorial. Please use this thread to provide feedback on vLLM integration and any other issues concerning this release. Release Notes: 1. nnsight support for vLLM inference has been completely refactored and works with the latest version of vLLM, including tensor paral...

Submit feedback: discuss.ndif.us/t/nnsight-0...

19.12.2025 22:51 👍 0 🔁 0 💬 0 📌 0

Release v0.5.13 · ndif-team/nnsight
Release Notes: v0.5.13 1. nnsight support for vLLM inference has been completely refactored and works with the latest version of vLLM, including tensor parallelism. Enabling fast inference on multi...

Release notes: github.com/ndif-team/n...

19.12.2025 22:51 👍 0 🔁 0 💬 1 📌 0
