Are the Codeforces Elo results not interesting?
25.12.2024 22:20
We just updated the arXiv version!
*Automatically Interpreting Millions of Features in LLMs*
by @norabelrose.bsky.social et al.
An open-source pipeline for finding interpretable features in LLMs with sparse autoencoders and automated explainability methods from @eleutherai.bsky.social.
arxiv.org/abs/2410.13928