George Kaissis

@g-k.ai

Professor for Human-Centred Transformative AI @ Hasso-Plattner-Institut. Previously @ Google DeepMind, Imperial College London, TU Munich. πŸ‡ͺπŸ‡Ί πŸ³οΈβ€πŸŒˆ https://www.g-k.ai

68
Followers
74
Following
14
Posts
16.02.2025
Joined

Latest posts by George Kaissis @g-k.ai

A promotional graphic for an oral presentation at the EACL 2026 conference in Morocco. The background features a sunny, historic Moroccan stone fortress gate with palm trees, a clear blue sky, and decorative geometric tile patterns in the corners. Text in the top left indicates the event is at Palais Des Congres, Rabat, from March 24-29, 2026. A banner across the middle displays the presentation title: "Unintended Memorization of Sensitive Information in Fine-Tuned Language Models." Below the title is a flowchart diagram illustrating how Large Language Models (LLMs) trained on sensitive medical text can inadvertently memorize Personally Identifiable Information (PII), and how a "True-Prefix Attack" can extract a patient's name even when fine-tuned for downstream tasks that do not contain PII. Text at the very bottom reads, "Oral Presentation: March 27 | 11:00 AM | Salle La Palmeraie."

Thrilled to present our paper "Unintended Memorization of Sensitive Information in Fine-Tuned Language Models" at #EACL2026 in Rabat! πŸ‡²πŸ‡¦

w/ J. Marin Ruiz, G. Kaissis, P. Seidl, R. v. Eisenhart-Rothe, F. Hinterwimmer & @danielrueckert.bsky.social.

Read here: arxiv.org/abs/2601.174...

10.03.2026 11:08 πŸ‘ 3 πŸ” 2 πŸ’¬ 2 πŸ“Œ 0
Optimal conversion from RΓ©nyi Differential Privacy to $f$-Differential Privacy

Anneliese Riess, Juan Felipe Gomez, Flavio du Pin Calmon, Julia Anne Schnabel, Georgios Kaissis

http://arxiv.org/abs/2602.04562

We prove the conjecture stated in Appendix F.3 of [Zhu et al. (2022)]: among all conversion rules that map a RΓ©nyi Differential Privacy (RDP) profile $Ο„\mapsto ρ(Ο„)$ to a valid hypothesis-testing trade-off $f$, the rule based on the intersection of single-order RDP privacy regions is optimal. This optimality holds simultaneously for all valid RDP profiles and for all Type I error levels $Ξ±$. Concretely, we show that in the space of trade-off functions, the tightest possible bound is $f_{ρ(\cdot)}(Ξ±) = \sup_{Ο„\geq 0.5} f_{Ο„,ρ(Ο„)}(Ξ±)$: the pointwise maximum of the single-order bounds for each RDP privacy region. Our proof unifies and sharpens the insights of [Balle et al. (2019)], [Asoodeh et al. (2021)], and [Zhu et al. (2022)]. Our analysis relies on a precise geometric characterization of the RDP privacy region, leveraging its convexity and the fact that its boundary is determined exclusively by Bernoulli mechanisms. Our results establish that the "intersection-of-RDP-privacy-regions" rule is not only valid, but optimal: no other black-box conversion can uniformly dominate it in the Blackwell sense, marking the fundamental limit of what can be inferred about a mechanism's privacy solely from its RDP guarantees.
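As an illustration of the conversion rule, here is a small numerical sketch. Assumptions, which go beyond what the abstract states: a Gaussian-style RDP profile ρ(τ) = τ/2, a finite grid of orders τ > 1 standing in for the supremum over τ ≥ 0.5, and the Bernoulli characterization of the single-order privacy region mentioned above; all function names are hypothetical.

```python
import numpy as np

def renyi_div_bern(p, q, tau):
    """Renyi divergence of order tau > 1 between Bernoulli(p) and Bernoulli(q)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return np.log(p**tau * q**(1 - tau) + (1 - p)**tau * (1 - q)**(1 - tau)) / (tau - 1)

def single_order_bound(alpha, tau, rho, grid=10001):
    """Smallest Type II error beta compatible with (tau, rho)-RDP at Type I error
    alpha, using the Bernoulli boundary of the single-order RDP privacy region."""
    betas = np.linspace(0.0, 1.0 - alpha, grid)
    ok = (renyi_div_bern(alpha, 1.0 - betas, tau) <= rho) & \
         (renyi_div_bern(1.0 - betas, alpha, tau) <= rho)
    # beta = 1 - alpha always satisfies both constraints, so ok is non-empty
    return float(betas[ok][0])

def f_from_rdp_profile(alpha, rho_of_tau, taus):
    """Pointwise supremum of the single-order bounds: the optimal conversion rule."""
    return max(single_order_bound(alpha, t, rho_of_tau(t)) for t in taus)
```

At each α, the pointwise maximum over orders gives a valid trade-off lower bound; the paper's result is that no black-box conversion from the RDP profile alone can improve on this supremum.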


05.02.2026 04:53 πŸ‘ 3 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
Screenshot of the paper title with authors listed

We extracted (parts of) 12 books in experiments with 4 frontier-lab production LLMs.

We prompted the LLMs with a short prefix of a book and asked them to complete the rest. For Harry Potter and the Sorcerer’s Stone, we extracted 95.8% of the book from jailbroken Claude 3.7 Sonnet.
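For readers wondering how an extraction percentage like 95.8% might be computed: a minimal sketch of one plausible metric. This is not necessarily the paper's exact measure; the chunk size and the verbatim-containment criterion are illustrative assumptions.

```python
def extracted_fraction(reference, generated, window=50):
    # Fraction of non-overlapping `window`-character chunks of the reference
    # text that appear verbatim in the model's output -- a simple proxy for
    # how much of the book was extracted.
    chunks = [reference[i:i + window]
              for i in range(0, len(reference) - window + 1, window)]
    if not chunks:
        return 0.0
    hits = sum(chunk in generated for chunk in chunks)
    return hits / len(chunks)
```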

07.01.2026 20:31 πŸ‘ 103 πŸ” 37 πŸ’¬ 6 πŸ“Œ 13

Let’s today not only commemorate Rosalind Franklin, but also Raymond Gosling, her @kingscollegelondon.bsky.social postgraduate who took Photo 51…

08.11.2025 17:52 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

I am overjoyed to announce that I have joined HPI as a full professor for Human-Centred Transformative AI. I am looking forward to working with my amazing new and current collaborators and would like to deeply thank everyone who has been part of the journey that led me here! @hpi.bsky.social

04.11.2025 06:04 πŸ‘ 4 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
New Professorship for #DigitalHealth: Prof. Dr. med. Georg Kaissis

On November 1, Prof. Dr. med. Georg Kaissis took up his professorship for #DigitalHealth: Human-Centred Transformative AI. His research focus: developing the next generation of multimodal #AI models.
More information on the new professorship: hpi.de/artikel/prof...

03.11.2025 10:37 πŸ‘ 5 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

We celebrated the 5th anniversary of our research chair at @tum.de! πŸ’™πŸ₯‚

It's been an incredible journey of research and collaboration. Thank you to everyone who has made this possible. We are very much looking forward to the next years to come!

#AIMAnniversary #AIMNews

03.11.2025 12:27 πŸ‘ 5 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0
VaultGemma: A Differentially Private Gemma Model

Amer Sinha, Thomas Mesnard, Ryan McKenna, Daogao Liu, Christopher A. Choquette-Choo, Yangsibo Huang, Da Yu, George Kaissis, Zachary Charles, Ruibo Liu, Lynn Chua, Pritish Kamath, Pasin Manurangsi, Steve He, Chiyuan Zhang, Badih Ghazi, Borja De Balle Pigem, Prem Eruvbetine, Tris Warkentin, Armand Joulin, Ravi Kumar

http://arxiv.org/abs/2510.15001

We introduce VaultGemma 1B, a 1 billion parameter model within the Gemma family, fully trained with differential privacy. Pretrained on the identical data mixture used for the Gemma 2 series, VaultGemma 1B represents a significant step forward in privacy-preserving large language models. We openly release this model to the community.
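The abstract does not detail the training recipe; as background, here is a generic sketch of the per-step aggregation in DP-SGD, the standard algorithm behind differentially private training. The function name and the numpy stand-in for real per-example gradients are illustrative, not VaultGemma's actual implementation.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    # One DP-SGD aggregation step: clip each per-example gradient to L2 norm
    # clip_norm, sum the clipped gradients, add Gaussian noise calibrated to
    # the clipping norm, then average over the batch.
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)
```

Clipping bounds any single example's influence on the update, which is what makes the added noise yield a formal privacy guarantee.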


20.10.2025 03:48 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
VaultGemma: The world's most capable differentially private LLM

It has been a privilege to work with so many amazing colleagues across Google and Google DeepMind to build VaultGemma, an LLM trained from scratch with Differential Privacy. Weights are openly available! Check it out here: research.google/blog/vaultge...

17.09.2025 17:25 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Congratulations Hugo! Well deserved!

02.09.2025 19:18 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Due to physical resource constraints, we currently estimate that around 300–400 of the candidate papers recommended for acceptance by the ACs will need to be rejected. We seek the support of our 41 SACs in addressing this distributed optimization problem in a fair and professional manner.

28.08.2025 16:12 πŸ‘ 20 πŸ” 3 πŸ’¬ 9 πŸ“Œ 1

NeurIPS has decided to do what ICLR did: As a SAC I received the message πŸ‘‡ This is wrong! If the review process cannot handle so many papers, the conference needs to split instead of arbitrarily rejecting 400 papers.

28.08.2025 16:12 πŸ‘ 105 πŸ” 17 πŸ’¬ 8 πŸ“Œ 2
SaTML 2026

πŸ“£ Researchers in AI security, privacy & fairness: It's time to share your latest work!

The SaTML 2026 submission site is live πŸ‘‰ hotcrp.satml.org

πŸ—“οΈ Deadline: Sept 24, 2025

@satml.org

27.08.2025 18:22 πŸ‘ 5 πŸ” 3 πŸ’¬ 0 πŸ“Œ 0

Thanks @aim-lab.bsky.social for having me and thank you to my co-presenter @milddave.bsky.social who led the exercise and @martinmenten.bsky.social and all others for organising!

18.07.2025 17:03 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0
eurips.cc A NeurIPS-endorsed conference in Europe held in Copenhagen, Denmark

NeurIPS is endorsing EurIPS, an independently organized meeting which will offer researchers an opportunity to additionally present NeurIPS work in Europe concurrently with NeurIPS.

Read more in our blog post and on the EurIPS website:
blog.neurips.cc/2025/07/16/n...
eurips.cc

16.07.2025 22:05 πŸ‘ 124 πŸ” 38 πŸ’¬ 2 πŸ“Œ 3

Also, great humans.

16.07.2025 06:20 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Cooper is the best, go chat if you get the chance!

16.07.2025 06:20 πŸ‘ 1 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

New preprint with the most precise mapping to date between differential privacy and the operational notions of privacy risk commonly used in practice:

10.07.2025 10:05 πŸ‘ 2 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
The Hitchhiker's Guide to Efficient, End-to-End, and Tight DP Auditing

Meenatchi Sundaram Muthu Selva Annamalai, Borja Balle, Jamie Hayes, Georgios Kaissis, Emiliano De Cristofaro

http://arxiv.org/abs/2506.16666

This paper systematizes research on auditing Differential Privacy (DP) techniques, aiming to identify key insights into the current state of the art and open challenges. First, we introduce a comprehensive framework for reviewing work in the field and establish three cross-contextual desiderata that DP audits should target: efficiency, end-to-end-ness, and tightness. Then, we systematize the modes of operation of state-of-the-art DP auditing techniques, including threat models, attacks, and evaluation functions. This allows us to highlight key details overlooked by prior work, analyze the limiting factors to achieving the three desiderata, and identify open research problems. Overall, our work provides a reusable and systematic methodology geared to assess progress in the field and identify friction points and future directions for our community to focus on.


23.06.2025 03:48 πŸ‘ 4 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0

Responsible medical AI demands patient-level privacy. Our recent #TPDP '25 paper extends Differential Privacy #DP beyond individual data points to protect entire patient profiles. 🧠

πŸ“„ Read the paper: tinyurl.com/k9fz456a
πŸŽ₯ Watch the video: tinyurl.com/2jx528m5

12.06.2025 07:55 πŸ‘ 3 πŸ” 1 πŸ’¬ 0 πŸ“Œ 0
A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets Supervised contrastive learning (SupCon) has proven to be a powerful alternative to the standard cross-entropy loss for classification of multi-class balanced datasets. However, it struggles to learn ...

Going to be at CVPR the next couple of days presenting our paper "A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets".

arxiv.org/abs/2503.17024

Always happy to meet anyone working on representation learning or tabular DL and medical data

11.06.2025 00:56 πŸ‘ 7 πŸ” 3 πŸ’¬ 0 πŸ“Œ 0
Graphic announcing the International Olympiad in Informatics (IOI) 2027. Date: September 12–19, 2027


Mark your calendars: IOI 2027 is coming to HPI! πŸŽ‰

For the first time since 1992, the International Olympiad in Informatics (IOI) is returning to Germany, and in cooperation with Bundesweite Informatikwettbewerbe (BWINF) we are the proud hosts of this renowned competition.

30.05.2025 10:32 πŸ‘ 3 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0
Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models State-of-the-art membership inference attacks (MIAs) typically require training many reference models, making it difficult to scale these attacks to large pre-trained language models (LLMs). As a resu...

Check out our new pre-print "Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models", joint work with fantastic colleagues from Google (DeepMind) and many other great institutions! Find it here: arxiv.org/abs/2505.18773
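The post doesn't spell out the attack, but the reference-model idea underlying strong membership inference attacks can be sketched as a z-score of the target model's loss against losses from models not trained on the example. This is a simplified, illustrative variant, not the preprint's exact method.

```python
import math

def mia_score(target_loss, reference_losses):
    # Reference-model membership score: how far below the reference-loss
    # distribution the target model's loss on this example falls.
    # Large positive => unusually low loss => likely a training member.
    mu = sum(reference_losses) / len(reference_losses)
    var = sum((x - mu) ** 2 for x in reference_losses) / max(len(reference_losses) - 1, 1)
    sigma = math.sqrt(var) if var > 0 else 1e-12
    return (mu - target_loss) / sigma
```

Scaling this to massive datasets is the hard part: training many reference models per example is exactly what makes such attacks expensive for LLMs.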

27.05.2025 08:00 πŸ‘ 4 πŸ” 2 πŸ’¬ 0 πŸ“Œ 0
Redirection for Erasing Memory (REM): Towards a universal unlearning method for corrupted data Machine unlearning is studied for a multitude of tasks, but specialization of unlearning methods to particular tasks has made their systematic comparison challenging. To address this issue, we propose...

Check out our new pre-print "Redirection for Erasing Memory (REM): Towards a universal unlearning method for corrupted data", joint work with excellent colleagues from Google DeepMind, Google Research, and the University of Cambridge. Find it here: arxiv.org/abs/2505.17730

26.05.2025 07:17 πŸ‘ 2 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Congratulations @danielrueckert.bsky.social, well deserved! You truly are a role model and a most impressive Fellow!

22.05.2025 19:34 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We are incredibly proud of Prof @danielrueckert.bsky.social for being elected as Fellow of the Royal Society! Well deserved and a testament to your dedication to research πŸŽ‰

22.05.2025 11:08 πŸ‘ 8 πŸ” 2 πŸ’¬ 1 πŸ“Œ 0

Now, it's time to up the ante.

We are committed to enshrining scientific freedom in EU law, creating a 7-year β€˜super grant’ to attract top researchers, and expanding support for the most promising scientists.

More β†’ europa.eu/!TTbWbJ

14.05.2025 07:02 πŸ‘ 78 πŸ” 18 πŸ’¬ 3 πŸ“Œ 6

Data attribution is crucial for debugging models and detecting low-quality data (spotting mislabeled samples, bias, etc.).
But many methods aren't mathematically sound and don't scale.

How could we improve this for large models?
1/n
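One widely used gradient-based baseline (not necessarily the thread's own method) scores a training example's influence on a test example as a learning-rate-weighted sum of gradient dot products across checkpoints, in the spirit of TracIn; the function name is illustrative.

```python
import numpy as np

def attribution_score(train_grads, test_grads, lrs):
    # TracIn-style score: sum over checkpoints t of lr_t * <g_train_t, g_test_t>.
    # Positive => this training example tended to push the test loss down;
    # strongly negative scores often flag mislabeled or corrupted samples.
    return sum(lr * float(np.dot(g_tr, g_te))
               for lr, g_tr, g_te in zip(lrs, train_grads, test_grads))
```

The scaling pain is visible even here: exact scores need per-example gradients at many checkpoints, which is what approximate methods try to avoid for large models.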

20.04.2025 08:48 πŸ‘ 4 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

Congratulations!

09.04.2025 15:16 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Excited to share: β€œA Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets” has been accepted to #CVPR2025! πŸŽ‰

Paper: lnkd.in/esKRqF5p
Code: lnkd.in/eZFvDA5Q

(Thread incoming πŸ‘‡)

01.04.2025 13:37 πŸ‘ 12 πŸ” 5 πŸ’¬ 1 πŸ“Œ 0