A classifier has an error of 0.15 and a fairness violation of 0.13, while the same classifier, trained on data with 6 relabeled samples, has the same error but a fairness violation of only 0.03.
In our new work we ask: Can end-users make a platform's ML models fairer?
Firm-side fair learning often reduces accuracy, discouraging firms from using it. But if a platform relies on user data, can minority users collectively change the data to induce fairness?
(1/4)
22.08.2025 06:45
An Iterative Algorithm for Differentially Private $k$-PCA with Adaptive Noise
Johanna Düngler, Amartya Sanyal
http://arxiv.org/abs/2508.10879
Given $n$ i.i.d. random matrices $A_i \in \mathbb{R}^{d \times d}$ that share
a common expectation $\Sigma$, the objective of Differentially Private
Stochastic PCA is to identify a subspace of dimension $k$ that captures the
largest variance directions of $\Sigma$, while preserving differential privacy
(DP) of each individual $A_i$. Existing methods either (i) require the sample
size $n$ to scale super-linearly with dimension $d$, even under Gaussian
assumptions on the $A_i$, or (ii) introduce excessive noise for DP even when
the intrinsic randomness within $A_i$ is small. Liu et al. (2022a) addressed
these issues for sub-Gaussian data but only for estimating the top eigenvector
($k=1$) using their algorithm DP-PCA. We propose the first algorithm capable of
estimating the top $k$ eigenvectors for arbitrary $k \leq d$, whilst overcoming
both limitations above. For $k=1$ our algorithm matches the utility guarantees
of DP-PCA, achieving near-optimal statistical error even when $n =
\tilde{\!O}(d)$. We further provide a lower bound for general $k > 1$, matching
our upper bound up to a factor of $k$, and experimentally demonstrate the
advantages of our algorithm over comparable baselines.
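To make the problem setting concrete, here is a minimal non-iterative baseline for the same task: clip each sample matrix, privatize the empirical mean with the Gaussian mechanism, and take the top-$k$ eigenvectors of the noisy mean. This is emphatically not the paper's adaptive algorithm (which calibrates noise to the intrinsic randomness of the $A_i$); the function name, clipping scheme, and noise calibration below are illustrative assumptions only.

```python
import numpy as np

def gaussian_mech_k_pca(As, k, eps, delta, clip=1.0):
    """Baseline DP k-PCA sketch (Gaussian mechanism), NOT the paper's method.

    As: list of n symmetric d x d matrices A_i with common expectation Sigma.
    Returns d x k matrix of approximate top-k eigenvectors of Sigma.
    """
    n = len(As)
    d = As[0].shape[0]
    # Clip each matrix in Frobenius norm so replacing one sample changes the
    # empirical mean by at most 2*clip/n (the L2 sensitivity of the mean).
    clipped = [A * min(1.0, clip / max(np.linalg.norm(A, 'fro'), 1e-12))
               for A in As]
    mean = sum(clipped) / n
    # Standard Gaussian-mechanism noise scale for (eps, delta)-DP.
    sigma = (2.0 * clip / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noise = np.random.normal(0.0, sigma, (d, d))
    noise = (noise + noise.T) / np.sqrt(2.0)  # keep the estimate symmetric
    # eigh returns eigenvalues in ascending order; take the last k columns.
    _, eigvecs = np.linalg.eigh(mean + noise)
    return eigvecs[:, -k:][:, ::-1]  # top-k eigenvectors, largest first
```

This baseline exhibits exactly the second limitation the abstract mentions: the noise scale ignores how concentrated the $A_i$ already are, which is what an adaptive-noise method improves on.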
15.08.2025 03:50
Technically, we
1️⃣ formalise the online learning unlearning (OLU) problem setting,
2️⃣ propose two styles of OLU algorithms, and
3️⃣ in the Online Convex Optimisation (OCO) framework, nearly match the regret guarantees of standard OCO without unlearning.
15.05.2025 15:47
🚨 New Paper: Online Learning and Unlearning 🚨
We look at learning and unlearning in the online setting where both learning and unlearning requests arrive continuously over time.
Led by @yaxihu.bsky.social, and joint work with Bernhard Schölkopf
arxiv.org/abs/2505.08557
15.05.2025 15:46
Can we align LLMs towards a desired behaviour while maintaining differential privacy guarantees?
We answer this in our #ICLR2025 paper.
Tl;dr: We propose, evaluate and audit a novel differentially private activation steering algorithm for aligning LLMs.
(1/🧵)
23.04.2025 09:15
Vacancies
Open postdoc position in learning theory, privacy, robustness, unlearning, or related topics with me and others at the University of Copenhagen, Denmark.
If you think you would be a good candidate, send me an email: amartya18x.github.io/hiring/
#postdoc
27.03.2025 20:23
IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), April 9-11, 2025, Copenhagen.
No plans for April 9–11 yet? Why not spend an amazing week in beautiful Copenhagen 🇩🇰, exploring cutting-edge research on trustworthy machine learning?
Join us at SaTML 2025, the premier conference on AI security, AI privacy, and AI fairness!
satml.org/attend
@satml.org
03.03.2025 15:03
Very shortly at @realaaai.bsky.social, @alext2.bsky.social and I will be giving a tutorial on the impact of the quality and availability of labels and data on the privacy, fairness, and robustness of ML algorithms.
See here amartya18x.github.io/files/Tutori...
@ucph.bsky.social
25.02.2025 12:27
3rd IEEE Conference on Secure and Trustworthy Machine Learning (IEEE SaTML)
University of Copenhagen, Denmark, April 9-11, 2025 - registration is open. satml.org
@amartyasanyal.bsky.social
22.02.2025 16:43
Differentially Private Steering for Large Language Model Alignment
Anmol Goel, Yaxi Hu, Iryna Gurevych, Amartya Sanyal
http://arxiv.org/abs/2501.18532
Aligning Large Language Models (LLMs) with human values and away from
undesirable behaviors (such as hallucination) has become increasingly
important. Recently, steering LLMs towards a desired behavior via activation
editing has emerged as an effective method to mitigate harmful generations at
inference-time. Activation editing modifies LLM representations by preserving
information from positive demonstrations (e.g., truthful) and minimising
information from negative demonstrations (e.g., hallucinations). When these
demonstrations come from a private dataset, the aligned LLM may leak private
information contained in those private samples. In this work, we present the
first study of aligning LLM behavior with private datasets. Our work proposes
the \textit{\underline{P}rivate \underline{S}teering for LLM
\underline{A}lignment (PSA)} algorithm to edit LLM activations with
differential privacy (DP) guarantees. We conduct extensive experiments on seven
different benchmarks with open-source LLMs of different sizes (0.5B to 7B) and
model families (Llama, Qwen, Mistral and Gemma). Our results show that PSA
achieves DP guarantees for LLM alignment with minimal loss in performance,
including alignment metrics, open-ended text generation quality, and
general-purpose reasoning. We also develop the first Membership Inference
Attack (MIA) for evaluating and auditing the empirical privacy for the problem
of LLM steering via activation editing. Our attack is tailored for activation
editing and relies solely on the generated texts without their associated
probabilities. Our experiments support the theoretical guarantees by showing
improved guarantees for our \textit{PSA} algorithm compared to several existing
non-private techniques.
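As a rough illustration of the core primitive, here is a sketch of a differentially private steering direction computed as the difference of privatized mean activations from positive and negative demonstrations. This is an assumption-laden simplification in the spirit of activation steering with DP, not the paper's PSA algorithm; the function name, clipping, and noise calibration are ours.

```python
import numpy as np

def dp_steering_vector(pos_acts, neg_acts, eps, delta, clip=1.0):
    """Sketch: DP steering direction from private demonstrations.

    Clips each activation vector, averages, and privatizes each mean with
    the Gaussian mechanism. Each mean is (eps, delta)-DP; by basic
    composition the returned difference is (2*eps, 2*delta)-DP overall.
    Illustrative only -- not the exact PSA algorithm from the paper.
    """
    def private_mean(acts):
        n = len(acts)
        # Clip in L2 norm so one demonstration moves the mean by <= 2*clip/n.
        clipped = [a * min(1.0, clip / max(np.linalg.norm(a), 1e-12))
                   for a in acts]
        mean = sum(clipped) / n
        sigma = (2.0 * clip / n) * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
        return mean + np.random.normal(0.0, sigma, mean.shape)

    # Direction that preserves "positive" information and suppresses
    # "negative" information, as in activation editing.
    return private_mean(pos_acts) - private_mean(neg_acts)
```

At inference time such a vector would be added to the model's activations at a chosen layer; with more demonstrations the required noise shrinks, which mirrors why DP alignment can come with minimal loss in performance.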
31.01.2025 04:33
DDSA PhD Fellowship Call 2025 | DDSA
PhD call in Denmark. Applications open!
28.01.2025 12:53
#ICLR
»Differentially Private Steering for Large Language Model Alignment« by @anmolgoel.bsky.social, Yaxi Hu, Iryna Gurevych (@igurevych.bsky.social) & Amartya Sanyal (@amartyasanyal.bsky.social)
(2/🧵)
27.01.2025 11:03
Thank you!
25.01.2025 17:03
Thanks Christoph!
25.01.2025 10:51
Thank you!!
24.01.2025 23:14
Millions in funding for young researchers
Nineteen promising researchers working within the technical and natural sciences have received funding of DKK 150 million for their research projects.
I was very lucky and happy to be awarded the Villum Young Investigator grant yesterday villumfonden.dk/en/news/mill...
Looking forward to the resulting research in unlearning, privacy, and online learning supported by the Villum foundation.
(Hiring motivated PhDs and postdocs, especially postdocs)
24.01.2025 23:12
@ccanonne.bsky.social : Steak holders
14.01.2025 23:50
And I think a similar argument holds for synthetic data.
Synthetic data algorithms that don't provably account for privacy probably don't provide privacy.
But there are private synthetic data generation algorithms that do, like the ones @gautamkamath.com linked above.
21.12.2024 18:38
I'll be at #NeurIPS2024 this week! Looking forward to presenting my joint work with Thomas Steinke (@stein.ke) and Jon Ullman (@thejonullman.bsky.social)
NeurIPS page with video: neurips.cc/virtual/2024...
Link to arxiv: arxiv.org/abs/2406.07407
11.12.2024 12:22
Excited to present at #NeurIPS2024 our work on robust mixture learning!
How hard is mixture learning when (a lot of) outliers are present? We show that it's easier than it seems!
Join us at the poster session (Wed, 16:30 PT, West Ballroom A-D #5710).
10.12.2024 20:31
Postdoc in Privacy and Robustness of Machine Learning Algorithms
Graduating with a PhD related to privacy and robustness in machine learning? Apply to this post-doc opening by @amartyasanyal.bsky.social: employment.ku.dk/faculty/?sho...
03.12.2024 09:57
Two weeks remaining to apply to this position.
I'll also be at NeurIPS, if you want to chat you can DM or email me. :)
02.12.2024 10:05
Three long research meetings throughout the day with four brilliant collaborators.
It was a good day.
27.11.2024 18:46
Done
27.11.2024 15:39
On the TCS job market: Hao Wu!
Hao Wu's research interests focus on both the theoretical and practical aspects of differentially private data analysis. He is actively pursuing opportunities in academia and industry.
1/2 #TCSSky #AcademicJobMarket
25.11.2024 20:32
Sixth AAAI Workshop on Privacy-Preserving Artificial Intelligence
Help needed!
Are you working on privacy, from a technical (e.g., differential privacy), policy, or law perspective?
Please give your availability to review for PPAI (ppai-workshop.github.io) if you can!
We'd highly appreciate it!
forms.gle/dqjVsBsR2y81...
23.11.2024 00:47
I made a starter pack for European researchers interested in some aspects of learning theory. The list is clearly not exhaustive, so please add your suggestions in the comments.
go.bsky.app/5o5uVnr
21.11.2024 10:31
Would love to be added as well if possible
19.11.2024 11:47
Open postdoctoral position in privacy, unlearning, and robustness in machine learning at the University of Copenhagen, to work with me!
Deadline: December 15th
If you want to spend a couple of years working on these exciting topics in beautiful Copenhagen, apply here: employment.ku.dk/all-vacancies/…
19.11.2024 08:29