Edoardo Debenedetti @NeurIPS

@edebenedetti

PhD student at ETH Zurich | Student Researcher at Google | Agent security and, more generally, ML security and privacy | edoardo.science | spylab.ai

247 Followers | 61 Following | 2 Posts | Joined 21.11.2023

Latest posts by Edoardo Debenedetti @NeurIPS @edebenedetti

I am at NeurIPS 🇨🇦, please reach out if you want to grab a coffee!

12.12.2024 22:36 👍 4 🔁 2 💬 0 📌 0

SPY Lab is in Vancouver for NeurIPS! Come say hi if you see us around 🕵️

10.12.2024 19:43 👍 10 🔁 2 💬 1 📌 1

I'm in Vancouver for NeurIPS! Feel free to reach out if you wanna meet to chat about security and privacy, especially in the context of LLM agents!

10.12.2024 14:59 👍 0 🔁 0 💬 0 📌 0

Come do open AI with us in Zurich!
We're hiring PhD students, postdocs (and faculty!)

04.12.2024 13:49 👍 11 🔁 3 💬 0 📌 1

Feel free to recommend more researchers to @javirandor.com to add to the list!

04.12.2024 11:31 👍 3 🔁 0 💬 0 📌 0

Apropos of today's Overleaf downtime/slowness: remember to have your files backed up on GitHub or locally! What if this happened on the day of a conference deadline?

03.12.2024 16:14 👍 17 🔁 2 💬 1 📌 0

Anyone may be able to compromise LLMs with malicious content posted online. With just a small amount of data, adversaries can backdoor chatbots to become unusable for RAG, or bias their outputs towards specific beliefs. Check our latest work! 👇🧵

25.11.2024 12:27 👍 5 🔁 2 💬 1 📌 1
Gradient Masking All-at-Once: Ensemble Everything Everywhere Is Not Robust
Ensemble everything everywhere is a defense to adversarial examples that was recently proposed to make image classifiers robust. This defense works by ensembling a model's intermediate representations...

Ensemble Everything Everywhere is a defense against adversarial examples that people got quite excited about a few months ago (in particular, the defense causes "perceptually aligned" gradients just like adversarial training).

Unfortunately, we show it's not robust...

arxiv.org/abs/2411.14834

25.11.2024 08:38 👍 28 🔁 9 💬 1 📌 0