
Angie Boggust

@angieboggust

MIT PhD candidate in the VIS group working on interpretability and human-AI alignment

422 Followers · 250 Following · 11 Posts · Joined 19.11.2024

Latest posts by Angie Boggust @angieboggust

Workshop on Visualization for AI Explainability: The role of visualization in artificial intelligence (AI) has gained significant attention in recent years. With the growing complexity of AI models, the critical need for understanding their inner-workin...

#VISxAI IS BACK!! 🤖📊

Submit your interactive “explainables” and “explorables” that visualize, interpret, and explain AI. #IEEEVIS

📆 Deadline: July 30, 2025

visxai.io

07.05.2025 21:56 👍 7 🔁 4 💬 0 📌 0

I'll be at #CHI2025 🌸

If you are excited about interpretability and human-AI alignment, let's chat!

And come see Abstraction Alignment ⬇️ in the Explainable AI paper session on Monday at 4:20 JST.

24.04.2025 13:05 👍 5 🔁 0 💬 0 📌 0
Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships While interpretability methods identify a model's learned concepts, they overlook the relationships between concepts that make up its abstractions and inform its ability to generalize to new data. To ...

Check out Abstraction Alignment at #CHI2025!

📄 Paper: arxiv.org/abs/2407.12543
💻 Demo: vis.mit.edu/abstraction-...
🎥 Video: www.youtube.com/watch?v=cLi9...
🔗 Project: vis.mit.edu/pubs/abstrac...

With Hyemin (Helen) Bang, @henstr.bsky.social, and @arvind.bsky.social

14.04.2025 15:48 ๐Ÿ‘ 3 ๐Ÿ” 2 ๐Ÿ’ฌ 0 ๐Ÿ“Œ 0

Abstraction Alignment reframes alignment around conceptual relationships, not just concepts.

It helps us audit models, datasets, and even human knowledge.

I'm excited to explore ways to 🏗 extract abstractions from models and 👥 align them to individual users' perspectives.

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Abstraction Alignment works on datasets too!

Medical experts analyzed clinical dataset abstractions, uncovering issues like overuse of unspecified diagnoses.

This mirrors real-world updates to medical abstractions โ€” showing how models can help us rethink human knowledge.

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
Two examples of Abstraction Alignment applied to a language model.

Language models often prefer specific answers even at the cost of performance.

But Abstraction Alignment reveals that the concepts an LM considers are often abstraction-aligned, even when it's wrong.

This helps separate surface-level errors from deeper conceptual misalignment.

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
A screenshot of the Abstraction Alignment interface.

And we packaged Abstraction Alignment and its metrics into an interactive interface so YOU can explore it!

🔗 https://vis.mit.edu/abstraction-alignment/

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Aggregating Abstraction Alignment helps us understand a model's global behavior.

We developed metrics to support this:
โ†”๏ธ Abstraction match โ€“ most aligned concepts
๐Ÿ’ก Concept co-confusion โ€“ frequently confused concepts
๐Ÿ—บ๏ธ Subgraph preference โ€“ preference for abstraction levels

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Abstraction Alignment compares model behavior to human abstractions.

By propagating the model's uncertainty through an abstraction graph, we can see how well it aligns with human knowledge.

E.g., confusing oaks 🌳 with palms 🌴 is more aligned than confusing oaks 🌳 with sharks 🦈.

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0

Interpretability identifies models' learned concepts (wheels 🛞).

But human reasoning is built on abstractions: relationships between concepts that help us generalize (wheels 🛞 → car 🚗).

To measure alignment, we must test if models learn human-like concepts AND abstractions.

14.04.2025 15:48 ๐Ÿ‘ 0 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 0
An overview of Abstraction Alignment, including its authors and links to the paper, demo, and code.

#CHI2025 paper on human–AI alignment! 🧵

Models can learn the right concepts but still be wrong in how they relate them.

✨Abstraction Alignment✨ evaluates whether models learn human-aligned conceptual relationships.

It reveals misalignments in LLMs 💬 and medical datasets 🏥.

🔗 arxiv.org/abs/2407.12543

14.04.2025 15:48 ๐Ÿ‘ 9 ๐Ÿ” 0 ๐Ÿ’ฌ 1 ๐Ÿ“Œ 2

Hey Julian, thank you so much for putting this together! My research is on interpretability and I'd love to be added.

24.11.2024 14:21 👍 6 🔁 0 💬 1 📌 0