
Sauvik Das

@sauvik.me

I work on human-centered {security|privacy|computing}. Associate Professor (w/o tenure) at @hcii.cmu.edu. Director of the SPUD (Security, Privacy, Usability, and Design) Lab. Non-Resident Fellow @cendemtech.bsky.social

467
Followers
114
Following
162
Posts
18.11.2024
Joined

Latest posts by Sauvik Das @sauvik.me

CBP Tapped Into the Online Advertising Ecosystem To Track People's Movements An internal DHS document obtained by 404 Media shows for the first time CBP used location data sourced from the online advertising industry to track phone locations. ICE has bought access to similar t...

SCOOP: An internal DHS document obtained by 404 Media shows for the first time CBP used location data sourced from the online advertising industry to track phone locations.

This surveillance can happen through all sorts of apps, such as video games, news apps, weather trackers, and dating apps.

03.03.2026 14:16 πŸ‘ 2143 πŸ” 1421 πŸ’¬ 59 πŸ“Œ 166

This is functionally an end-run around judicial oversight and due process, transforming consumer data into a tool for tracking and enforcement while inevitably sweeping in non-targets, including U.S. citizens and lawful residents.

02.03.2026 15:32 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Data collected for advertising should not be repurposed for government surveillance or tracking. This secondary use of personal data violates contextual integrity and breaches basic data minimization by turning commercially gathered information into a general investigative resource.

02.03.2026 15:32 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

In January, ICE/DHS put out an RFI on how companies with β€œcommercial Big Data and Ad Tech” products can β€œdirectly support investigations activities.”

This effort would go against the best practice of minimizing data collection as a safeguard against misuse.

02.03.2026 15:32 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Honored to be named Privacy co-chair of ACM's U.S. Technology Policy Committee with @benwinters.bsky.social

On that note: please read USTPC's statement discouraging adtech vendors from sharing personal data with DHS/ICE.

02.03.2026 15:32 πŸ‘ 4 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Can LLMs really serve as "crash dummies" for security & privacy testing? We put this assumption to the test.

🚨New preprint 🚨: "How Well Can LLM Agents Simulate End-User Security and Privacy Attitudes and Behaviors?"

πŸ‘‡ THREAD πŸ‘‡
[Link to paper: arxiv.org/abs/2602.184...
[1/n]

25.02.2026 17:47 πŸ‘ 4 πŸ” 2 πŸ’¬ 1 πŸ“Œ 0

New statement from the ACM U.S. Technology Policy Committee on AdTech & DHS privacy/security: computing experts urge stronger safeguards around government use of advertising-tech data collection. #Privacy #Security #TechPolicy

acm.org/binaries/con...

24.02.2026 21:33 πŸ‘ 3 πŸ” 2 πŸ’¬ 0 πŸ“Œ 2

🚨New paper🚨

We spent a year working with emergency preparedness policymakers to answer a simple question: can LLM agent simulations actually help real institutions make better decisions?
The answer is yesβ€”but perhaps not how you'd expect.

πŸ‘‡ THREAD πŸ‘‡
[Link to paper: arxiv.org/abs/2509.218...
[1/n]

13.02.2026 18:42 πŸ‘ 1 πŸ” 2 πŸ’¬ 1 πŸ“Œ 0

In short: Quantifying privacy risks can help users make more informed decisionsβ€”but the UX needs to present risks in a manner that is interpretable and actionable to truly *empower* users, rather than scare them.

Thanks @NSF for supporting this work!

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

(1) Pair risk flags with actionable guidance (how to preserve intent, reduce risk)
(2) Explain plausible attacker exploits (not just β€œrisk: high”)
(3) Communicate risk without pushing unnecessary self-censorship
(4) Use intuitive language/visuals; avoid jargon

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Interestingly, no single UI for presenting PREs to users β€œwon”.

Participants didn’t show a strong overall preference across the five designs (though β€œrisk by disclosure” tended to be liked more; the meter less).

So what *should* PRE designs do? 4 design recommendations:

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

…but sometimes PREs encouraged self-censorship.

A meaningful chunk of reflections ended with deleting the post, not posting at all, or even leaving the platform.

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Finding #2: PREs drove action (often good!).
In 66% of reflections, participants envisioned the user editing the post.

Most commonly: β€œevasive but still expressive” edits (change details, generalize, remove a pinpoint).

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Finding #1: PREs often *shifted perspective*.
In ~74% of reflections, participants expected higher privacy awareness / risk concern.

…but awareness came with emotional costs.
Many participants anticipated anxiety, frustration, or feeling stuck about trade-offs.

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

The 5 concepts ranged from:

(1) raw k-anonymity score
(2) a re-identifiability β€œmeter”
(3) low/med/high simplified risk
(4) threat-specific risk
(5) β€œrisk by disclosure” (which details contribute most)

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
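Concept (5), "risk by disclosure", can be sketched as a leave-one-out computation: recompute the anonymity set with each disclosed detail withheld in turn, and rank details by how much their removal grows it. This is a hypothetical illustration, not the study's implementation; the population data and attribute names are made up.

```python
# Hypothetical "risk by disclosure" sketch: rank disclosed details by how
# much withholding each one would grow the anonymity set (leave-one-out k).
# Population data and attribute names are illustrative, not from the paper.

def k_count(population, disclosures):
    """k = number of records matching every disclosed (attribute, value) pair."""
    return sum(
        all(p.get(a) == v for a, v in disclosures.items()) for p in population
    )

def risk_by_disclosure(population, disclosures):
    base_k = k_count(population, disclosures)
    impact = {}
    for attr in disclosures:
        rest = {a: v for a, v in disclosures.items() if a != attr}
        # A bigger jump in k when an attribute is withheld means that
        # attribute contributes more to identifiability.
        impact[attr] = k_count(population, rest) - base_k
    return sorted(impact.items(), key=lambda kv: kv[1], reverse=True)

population = [
    {"city": "Austin", "age": 29}, {"city": "Austin", "age": 29},
    {"city": "Austin", "age": 41}, {"city": "Austin", "age": 55},
    {"city": "Dallas", "age": 29},
]

print(risk_by_disclosure(population, {"city": "Austin", "age": 29}))
# -> [('age', 2), ('city', 1)]: here, disclosing age is the bigger risk driver.
```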

Method: speculative design + design fictions.

We storyboarded 5 PRE UI concepts using comic-boards (different ways to show risk + what’s driving it).

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

The core design question:

How should PREs be presented so they help people make better disclosure decisions… *without* nudging them into unnecessary self-censorship?

We don't want people to stop posting β€”Β we want them to make informed disclosure decisions accounting for risks.

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

This paper explores how to present β€œpopulation risk estimates” (PREs): an AI-driven estimate of how uniquely identifiable you are based on your disclosures.

Smaller β€œk” means you're more identifiable (e.g., k=1 means only 1 person matches everything you have disclosed)

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0
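A minimal sketch of the k idea above, counting how many people in a reference population match everything a user has disclosed. This is illustrative only, not the paper's actual PRE pipeline, and the population records below are hypothetical.

```python
# Illustrative k-anonymity count: k = number of people in a reference
# population whose records match every disclosed attribute.
# Not the paper's PRE method; population data here is hypothetical.

def k_anonymity(population, disclosures):
    """Count records matching all disclosed (attribute, value) pairs."""
    return sum(
        all(person.get(attr) == val for attr, val in disclosures.items())
        for person in population
    )

population = [
    {"city": "Pittsburgh", "age": 34, "job": "professor"},
    {"city": "Pittsburgh", "age": 34, "job": "nurse"},
    {"city": "Pittsburgh", "age": 51, "job": "professor"},
]

# Disclosing only a city leaves a larger anonymity set...
print(k_anonymity(population, {"city": "Pittsburgh"}))  # -> 3
# ...while each added detail shrinks it; k = 1 means uniquely identifiable.
print(k_anonymity(population,
                  {"city": "Pittsburgh", "age": 34, "job": "professor"}))  # -> 1
```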

This paper is the latest in a productive collaboration between my lab, @cocoweixu, and @alan_ritter.

ACL'24 -> a SOTA self-disclosure detection model
CSCW'25 -> a human-AI collaboration study of disclosure risk mitigation
NeurIPS'25 -> a method to quantify self-disclosure risk

10.02.2026 18:07 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

πŸ“£ New at #CHI2026
People share sensitive things β€œanonymously”… but anonymity is hard to reason about.

What if we could quantify re-identification risk with AI? How should we present those AI-estimated risks to users?

Led by my student Isadora Krsek

Paper: www.sauvik.me/papers/70/s...

10.02.2026 18:07 πŸ‘ 1 πŸ” 1 πŸ’¬ 1 πŸ“Œ 0

Check out the paper! It's one of the coolest papers from my lab in that it includes both a fully working system *and* a very comprehensive mixed-methods evaluation. Still had a reviewer who wanted even more, but c'est la vie 😂

www.sauvik.me/papers/69/s...

Thanks for the support @NSF!

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 0 πŸ“Œ 0

Thus, even though LLM assistance improved outputs, it also raised practitioners' expectations of what the AI would handle for them and made the manual work they *did* have to do feel extra burdensome. A stark design tension for the future of AI-assisted work.

09.02.2026 19:13 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

A surprising aside: we added a number of design frictions to Privy-LLM to encourage critical thinking. As a result, some practitioners rated Privy-LLM as being *less helpful* than those who used just the static template (where they had to do much more of the work manually).

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Key detail: experts also rated the LLM-condition mitigations as especially good β€œconversation starters”—i.e., credible enough to bring to a product team and use to kick off real mitigation planning.

This could help bring privacy and product teams closer together.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

But outputs from the LLM-supported version were rated higher quality overall: clearer, more correct, with more relevant/severe risks identified, and with mitigation plans that experts saw as more effective and more product-specific.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Both versions enabled practitioners to produce strong privacy impact assessments (as judged by experts). So the scaffolding itself mattered, irrespective of the AI support provided.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We recruited 24 industry practitioners to use one of the two versions (between-subjects). Their assessments were then rated by 13 independent privacy experts across multiple quality dimensions.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

We made two versions:
A) an interactive LLM-assisted Privy (w/ intentional design frictions to encourage critical thinking)
B) a structured worksheet modeled after existing PIAs

Same underlying workflowβ€”one with AI support and one without.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

Privy then helps folks:
β€’ articulate how each risk could show up in this specific product
β€’ prioritize what is most relevant and severe
β€’ draft mitigations that protect people without flattening the feature’s utility.

09.02.2026 19:13 πŸ‘ 0 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0

In a vacuum, it's hard to anticipate how a product will lead to privacy risks.

Privy guides folks through a workflow to articulate:
β€’ who uses the product + who’s affected
β€’ what the AI can do
β€’ what data it needs / produces

β†’ then maps that to the AI privacy taxonomy.

09.02.2026 19:13 πŸ‘ 1 πŸ” 0 πŸ’¬ 1 πŸ“Œ 0