Findings:
❓ Privacy preservation at inference time is really underexplored!
🔍 LMs struggle to prevent PII leakage in their summaries.
👩‍⚖️ Human evaluations reveal privacy risks that metrics may overlook.
Paper w/ @naletras.bsky.social and Ning Ma
Cc. @sltcdt.bsky.social
22.12.2024 21:41
We tested 5 LMs (both open and closed) on summarization datasets from three domains: medical, legal, and general. We analyzed both prompting and fine-tuning techniques to guide LMs toward safer summaries, and measured how much PII leaks into the generated output.
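The thread doesn't spell out how leakage is quantified. As a rough illustration only (not the paper's actual metric), one simple proxy is to check how many known PII spans from the source document reappear verbatim in the generated summary. The function and example data below are hypothetical:

```python
import re

def pii_leakage_rate(summary: str, pii_spans: list[str]) -> float:
    """Fraction of known PII strings from the source that reappear
    verbatim (case-insensitively) in the summary. A crude proxy:
    paraphrased or partially leaked PII is not caught."""
    if not pii_spans:
        return 0.0
    leaked = sum(
        1 for span in pii_spans
        if re.search(re.escape(span), summary, re.IGNORECASE)
    )
    return leaked / len(pii_spans)

# Hypothetical example: gold PII annotations from a source document
source_pii = ["John Doe", "123 Main St", "03/04/1985"]
summary = "The patient, John Doe, was admitted with chest pain."
print(pii_leakage_rate(summary, source_pii))  # 1 of 3 spans leaked
```

Exact string matching like this would miss abstractive rewordings of PII, which is one reason the thread's human evaluations surface risks that automatic metrics overlook.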
How Private are Language Models in Abstractive Summarization?
Language models (LMs) have shown outstanding performance in text summarization including sensitive domains such as medicine and law. In these settings, it is important that personally identifying info...
Wrote up my first piece of PhD work last week! 🧵
LMs are great at extracting info from documents via summarization, but what happens in sensitive settings where privacy preservation is essential?
Short answer: LMs are poor privacy preservers.
Arxiv: arxiv.org/abs/2412.12040