A classifier has an error of 0.15 and a fairness violation of 0.13, while the same classifier, trained on data with 6 relabeled samples, has the same error but a fairness violation of only 0.03.
In our new work we ask: Can end-users make a platform's ML models fairer?
Firm-side fair learning often reduces accuracy, discouraging firms from using it. But if a platform relies on user data, can minority users collectively change the data to induce fairness?
(1/4)
22.08.2025 06:45
An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
Johanna Düngler, Amartya Sanyal
http://arxiv.org/abs/2508.10879
Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share
a common expectation $\Sigma$, the objective of Differentially Private
Stochastic PCA is to identify a subspace of dimension $k$ that captures the
largest variance directions of $\Sigma$, while preserving differential privacy
(DP) of each individual $A_i$. Existing methods either (i) require the sample
size $n$ to scale super-linearly with dimension $d$, even under Gaussian
assumptions on the $A_i$, or (ii) introduce excessive noise for DP even when
the intrinsic randomness within $A_i$ is small. Liu et al. (2022a) addressed
these issues for sub-Gaussian data but only for estimating the top eigenvector
($k=1$) using their algorithm DP-PCA. We propose the first algorithm capable of
estimating the top $k$ eigenvectors for arbitrary $k \leq d$, whilst overcoming
both limitations above. For $k=1$ our algorithm matches the utility guarantees
of DP-PCA, achieving near-optimal statistical error even when $n =
\tilde{\!O}(d)$. We further provide a lower bound for general $k > 1$, matching
our upper bound up to a factor of $k$, and experimentally demonstrate the
advantages of our algorithm over comparable baselines.
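To make the problem setting concrete, here is a minimal non-iterative baseline for the same task: clip each sample matrix, privatize the empirical mean with the Gaussian mechanism, and take the top-$k$ eigenvectors of the noisy mean. This is emphatically not the paper's adaptive algorithm (which calibrates noise to the intrinsic randomness of the $A_i$); the function name, clipping scheme, and noise calibration below are illustrative assumptions only.

```python
import numpy as np

def gaussian_mech_k_pca(As, k, eps, delta, clip=1.0):
    """Baseline DP k-PCA sketch (Gaussian mechanism), NOT the paper's method.

    As: list of n symmetric d x d matrices A_i with common expectation Sigma.
    Returns d x k matrix of approximate top-k eigenvectors of Sigma.
    """
    n = len(As)
    d = As[0].shape[0]
    # Clip each matrix in Frobenius norm so replacing one sample changes the
    # empirical mean by at most 2*clip/n (the L2 sensitivity of the mean).
    clipped = [A * min(1.0, clip / max(np.linalg.norm(A, 'fro'), 1e-12))
               for A in As]
    mean = sum(clipped) / n
    # Standard Gaussian-mechanism noise scale for (eps, delta)-DP.
    sigma = (2.0 * clip / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = np.random.normal(0.0, sigma, (d, d))
    noise = (noise + noise.T) / np.sqrt(2.0)  # keep the estimate symmetric
    # eigh returns eigenvalues in ascending order; take the last k columns.
    _, eigvecs = np.linalg.eigh(mean + noise)
    return eigvecs[:, -k:][:, ::-1]  # top-k eigenvectors, largest first
```

This baseline exhibits exactly the second limitation the abstract mentions: the noise scale ignores how concentrated the $A_i$ already are, which is what an adaptive-noise method improves on.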
15.08.2025 03:50
Technically, we
1️⃣ formalise the online learning unlearning (OLU) problem setting,
2️⃣ propose two styles of OLU algorithms, and
3️⃣ in the Online Convex Optimisation (OCO) framework, nearly match the regret guarantees of standard OCO without unlearning.
15.05.2025 15:47
🚨 New Paper: Online Learning and Unlearning 🚨
We look at learning and unlearning in the online setting where both learning and unlearning requests arrive continuously over time.
Led by @yaxihu.bsky.social, and joint work with Bernhard Schölkopf
arxiv.org/abs/2505.08557
15.05.2025 15:46
Can we align LLMs towards a desired behaviour while maintaining differential privacy guarantees?
We answer this in our #ICLR2025 paper.
Tl;dr: We propose, evaluate and audit a novel differentially private activation steering algorithm for aligning LLMs.
(1/🧵)
23.04.2025 09:15
Vacancies
Open postdoc position in learning theory, privacy, robustness, unlearning, or related topics with me and others at the University of Copenhagen, Denmark.
If you think you would be a good candidate, send me an email: amartya18x.github.io/hiring/
#postdoc
27.03.2025 20:23
IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), April 9-11, 2025, Copenhagen.
No plans for April 9–11 yet? Why not spend an amazing week in beautiful Copenhagen 🇩🇰, exploring cutting-edge research on trustworthy machine learning?
Join us at SaTML 2025, the premier conference on AI security, AI privacy, and AI fairness!
satml.org/attend
@satml.org
03.03.2025 15:03
Very shortly at @realaaai.bsky.social, @alext2.bsky.social and I will be giving a tutorial on the impact of the quality and availability of labels and data on the privacy, fairness, and robustness of ML algorithms.
See here amartya18x.github.io/files/Tutori...
@ucph.bsky.social
25.02.2025 12:27
3rd IEEE Conference on Secure and Trustworthy Machine Learning (IEEE SaTML)
University of Copenhagen, Denmark, April 9-11, 2025 - registration is open. satml.org
@amartyasanyal.bsky.social
22.02.2025 16:43
Differentially Private Steering for Large Language Model Alignment
Anmol Goel, Yaxi Hu, Iryna Gurevych, Amartya Sanyal
http://arxiv.org/abs/2501.18532
Aligning Large Language Models (LLMs) with human values and away from
undesirable behaviors (such as hallucination) has become increasingly
important. Recently, steering LLMs towards a desired behavior via activation
editing has emerged as an effective method to mitigate harmful generations at
inference-time. Activation editing modifies LLM representations by preserving
information from positive demonstrations (e.g., truthful) and minimising
information from negative demonstrations (e.g., hallucinations). When these
demonstrations come from a private dataset, the aligned LLM may leak private
information contained in those private samples. In this work, we present the
first study of aligning LLM behavior with private datasets. Our work proposes
the \textit{\underline{P}rivate \underline{S}teering for LLM
\underline{A}lignment (PSA)} algorithm to edit LLM activations with
differential privacy (DP) guarantees. We conduct extensive experiments on seven
different benchmarks with open-source LLMs of different sizes (0.5B to 7B) and
model families (Llama, Qwen, Mistral and Gemma). Our results show that PSA
achieves DP guarantees for LLM alignment with minimal loss in performance,
including alignment metrics, open-ended text generation quality, and
general-purpose reasoning. We also develop the first Membership Inference
Attack (MIA) for evaluating and auditing the empirical privacy for the problem
of LLM steering via activation editing. Our attack is tailored for activation
editing and relies solely on the generated texts without their associated
probabilities. Our experiments support the theoretical guarantees by showing
improved guarantees for our \textit{PSA} algorithm compared to several existing
non-private techniques.
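As a rough illustration of the core primitive, here is a sketch of a differentially private steering direction computed as the difference of privatized mean activations from positive and negative demonstrations. This is an assumption-laden simplification in the spirit of activation steering with DP, not the paper's PSA algorithm; the function name, clipping, and noise calibration are ours.

```python
import numpy as np

def dp_steering_vector(pos_acts, neg_acts, eps, delta, clip=1.0):
    """Sketch: DP steering direction from private demonstrations.

    Clips each activation vector, averages, and privatizes each mean with
    the Gaussian mechanism. Each mean is (eps, delta)-DP; by basic
    composition the returned difference is (2*eps, 2*delta)-DP overall.
    Illustrative only -- not the exact PSA algorithm from the paper.
    """
    def private_mean(acts):
        n = len(acts)
        # Clip in L2 norm so one demonstration moves the mean by <= 2*clip/n.
        clipped = [a * min(1.0, clip / max(np.linalg.norm(a), 1e-12))
                   for a in acts]
        mean = sum(clipped) / n
        sigma = (2.0 * clip / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
        return mean + np.random.normal(0.0, sigma, mean.shape)

    # Direction that preserves "positive" information and suppresses
    # "negative" information, as in activation editing.
    return private_mean(pos_acts) - private_mean(neg_acts)
```

At inference time such a vector would be added to the model's activations at a chosen layer; with more demonstrations the required noise shrinks, which mirrors why DP alignment can come with minimal loss in performance.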
31.01.2025 04:33
DDSA PhD Fellowship Call 2025 | DDSA
PhD call in Denmark. Applications open!
28.01.2025 12:53
#ICLR
»Differentially Private Steering for Large Language Model Alignment« by @anmolgoel.bsky.social, Yaxi Hu, Iryna Gurevych (@igurevych.bsky.social) & Amartya Sanyal (@amartyasanyal.bsky.social)
(2/🧵)
27.01.2025 11:03
Thank you!
25.01.2025 17:03
Thanks Christoph!
25.01.2025 10:51
Thank you!!
24.01.2025 23:14
Millions in funding for young researchers
Nineteen promising researchers working within the technical and natural sciences have received funding of DKK 150 million for their research projects.
I was very lucky and happy to be awarded the Villum Young Investigator grant yesterday villumfonden.dk/en/news/mill...
Looking forward to the resulting research in unlearning, privacy, and online learning supported by the Villum foundation.
(Hiring motivated PhDs and postdocs, especially postdocs)
24.01.2025 23:12
@ccanonne.bsky.social : Steak holders
14.01.2025 23:50
And I think a similar argument holds for synthetic data.
Synthetic data algorithms that don't provably account for privacy probably don't provide privacy.
But there are private synthetic data generation algorithms that do, like the ones @gautamkamath.com linked above.
21.12.2024 18:38
I'll be at #NeurIPS2024 this week! Looking forward to presenting my joint work with Thomas Steinke (@stein.ke) and Jon Ullman (@thejonullman.bsky.social)
NeurIPS page with video: neurips.cc/virtual/2024...
Link to arxiv: arxiv.org/abs/2406.07407
11.12.2024 12:22
Excited to present at #NeurIPS2024 our work on robust mixture learning!
How hard is mixture learning when (a lot of) outliers are present? We show that it's easier than it seems!
Join us at the poster session (Wed, 16:30 PT, West Ballroom A-D #5710).
10.12.2024 20:31
Postdoc in Privacy and Robustness of Machine Learning Algorithms
Graduating with a PhD related to privacy and robustness in machine learning? Apply to this post-doc opening by @amartyasanyal.bsky.social: employment.ku.dk/faculty/?sho...
03.12.2024 09:57
Two weeks remaining to apply to this position.
I'll also be at NeurIPS, if you want to chat you can DM or email me. :)
02.12.2024 10:05
Three long research meetings throughout the day with four brilliant collaborators.
It was a good day.
27.11.2024 18:46
Done
27.11.2024 15:39
On the TCS job market: Hao Wu!
Hao Wu's research interests focus on both the theoretical and practical aspects of differentially private data analysis. He is actively pursuing opportunities in academia and industry.
1/2 #TCSSky #AcademicJobMarket
25.11.2024 20:32
Sixth AAAI Workshop on Privacy-Preserving Artificial Intelligence
Help needed!
Are you working on privacy, from a technical (e.g., differential privacy), policy, or law perspective?
Please give your availability to review for PPAI (ppai-workshop.github.io) if you can!
We'd highly appreciate it!
forms.gle/dqjVsBsR2y81...
23.11.2024 00:47
I made a starter pack for European researchers interested in some aspects of learning theory. The list is clearly not exhaustive, so please add your suggestions in the comments.
go.bsky.app/5o5uVnr
21.11.2024 10:31
Would love to be added as well if possible
19.11.2024 11:47
Open postdoctoral position in privacy, unlearning, and robustness in machine learning at the University of Copenhagen, to work with me!
Deadline: December 15th
If you want to spend a couple of years working on these exciting topics in beautiful Copenhagen, apply here: employment.ku.dk/all-vacancies/…
19.11.2024 08:29