
Harry Cheon

@scheon.com

"Seung Hyun" | MS CS & BS Applied Math @UCSD 🌊 | LPCUWC 18' 🇭🇰 | AI Evaluation, Safety, Alignment | 🇰🇷 harry.scheon.com

328
Followers
32
Following
7
Posts
15.11.2024
Joined

Latest posts by Harry Cheon @scheon.com

The Actuary's Final Word on Algorithmic Decision Making
Paul Meehl's foundational work "Clinical versus Statistical Prediction," provided early theoretical justification and empirical evidence of the superiority of statistical methods over clinical judgmen...

In a new paper, I try to resolve the counterintuitive evidence of Meehl’s “clinical vs statistical prediction” problems: Statistics only wins because the game is rigged.

08.09.2025 14:48 👍 29 🔁 9 💬 3 📌 0

When RAG systems hallucinate, is the LLM misusing available information or is the retrieved context insufficient? In our #ICLR2025 paper, we introduce "sufficient context" to disentangle these failure modes. Work w Jianyi Zhang, Chun-Sung Ferng, Da-Cheng Juan, Ankur Taly, @cyroid.bsky.social

24.04.2025 18:18 👍 11 🔁 5 💬 1 📌 0

Hey AI folks - stop using SHAP! It won't help you debug [1], won't catch discrimination [2], and makes no sense for feature importance [3].

Plus - as we show - it also won't give recourse.

In a paper at #ICLR we introduce feature responsiveness scores... 1/

arxiv.org/pdf/2410.22598

24.04.2025 16:37 👍 29 🔁 8 💬 3 📌 0

Many ML models predict labels that don’t reflect what we care about, e.g.:
– Diagnoses from unreliable tests
– Outcomes from noisy electronic health records

In a new paper w/@berkustun, we study how this subjects individuals to a lottery of mistakes.
Paper: bit.ly/3Y673uZ
🧵👇

19.04.2025 23:04 👍 12 🔁 2 💬 1 📌 0

We'll be @ ICLR!

Poster: Sat 26 Apr 10AM – 12:30PM SGT

Paper: tinyurl.com/2deek4wx
Code: tinyurl.com/2rb6zc28

24.04.2025 06:19 👍 2 🔁 1 💬 0 📌 0

We develop methods to compute responsiveness scores for any dataset and model. Three main advantages:
1. Can be swapped in place of existing methods
2. Highlight responsive features
3. Flag instances where such features don't exist!

24.04.2025 06:19 👍 2 🔁 1 💬 1 📌 0

Current approaches are unable to inform consumers when:
1. features are not responsive
2. features are not monotonically responsive (e.g., can't increase income "too much")
3. features must change in counterintuitive ways (e.g., decrease income) to obtain the desired prediction

24.04.2025 06:19 👍 2 🔁 1 💬 1 📌 0
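
The three cases in the post above can be told apart by sweeping one feature and watching how the prediction responds. The following is only a rough sketch: `response_pattern`, `toy_predict`, the income band, and the candidate values are all hypothetical and are not taken from the paper.

```python
def response_pattern(predict, x, feature, candidate_values, desired=1):
    """Describe how the prediction responds when only `feature` is changed."""
    current = x[feature]
    up, down = [], []  # outcomes for increases / decreases, in ascending value order
    for value in sorted(candidate_values):
        if value == current:
            continue  # keeping the current value is not a change
        changed = dict(x)
        changed[feature] = value
        ok = predict(changed) == desired
        (up if value > current else down).append(ok)
    if not any(up) and not any(down):
        return "not responsive"
    if not any(up):
        return "responsive only when decreased"
    # A successful increase followed by a failing larger increase breaks monotonicity.
    if any(a and not b for a, b in zip(up, up[1:])):
        return "not monotonically responsive"
    return "monotonically responsive"


def toy_predict(applicant):
    # Hypothetical model: approve only inside an income band.
    return int(50 <= applicant["income"] <= 80)


income_values = list(range(30, 101, 10))  # assumed feasible values: 30, 40, ..., 100
low = {"income": 40, "credit_utilization": 90}
high = {"income": 90, "credit_utilization": 90}

low_income = response_pattern(toy_predict, low, "income", income_values)
high_income = response_pattern(toy_predict, high, "income", income_values)
utilization = response_pattern(toy_predict, low, "credit_utilization", [10, 50, 90])
```

In this toy setup the low-income applicant can raise income, but not "too much", the high-income applicant would have to decrease income, and changing credit utilization never helps.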

But SHAP highlights features that are:
1. Immutable: HistoryOfLatePayment
2. Mutable but not actionable: Age, NumberOfDependents
3. Actionable but not responsive: CreditUtilization

24.04.2025 06:19 👍 2 🔁 1 💬 1 📌 0

Hence, we designed responsiveness scores to highlight features that are actionable and responsive (i.e., lead to desired prediction when changed)

24.04.2025 06:19 👍 2 🔁 1 💬 1 📌 0
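
To make "actionable and responsive" concrete, here is a minimal sketch of one way such a score could be computed: the fraction of feasible single-feature changes that lead to the desired prediction. The paper's exact definition may differ; `responsiveness_score`, `toy_predict`, the applicant, and the candidate values are all hypothetical.

```python
def responsiveness_score(predict, x, feature, candidate_values, desired=1):
    """Fraction of single-feature changes that yield the desired prediction."""
    hits, total = 0, 0
    for value in candidate_values:
        if value == x[feature]:
            continue  # keeping the current value is not a change
        changed = dict(x)
        changed[feature] = value
        total += 1
        hits += int(predict(changed) == desired)
    return hits / total if total else 0.0


def toy_predict(applicant):
    # Hypothetical model: approve if income is high enough and there are no late payments.
    return int(applicant["income"] >= 50 and applicant["late_payments"] == 0)


applicant = {"income": 40, "credit_utilization": 90, "late_payments": 0}

# Assumed feasible values for actionable features only; immutable features are left out.
actions = {
    "income": [30, 40, 50, 60, 70],
    "credit_utilization": [10, 30, 50, 70, 90],
}

scores = {
    feature: responsiveness_score(toy_predict, applicant, feature, values)
    for feature, values in actions.items()
}

# If every score is zero, no single-feature change helps and the case should be flagged.
has_responsive_feature = any(score > 0 for score in scores.values())
```

Here income scores 0.75 and credit utilization scores 0.0, so only income would be highlighted for this applicant.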

Many countries seek to protect consumers in applications like lending and hiring by requiring explanations for adverse outcomes. But,
- Many provide companies with substantial flexibility
- Standard approach is to use methods like SHAP and LIME to highlight important features

24.04.2025 06:19 👍 2 🔁 1 💬 1 📌 0

Denied a loan, an interview, or an insurance claim by machine learning models? You may be entitled to a list of reasons.

In our latest w @anniewernerfelt.bsky.social @berkustun.bsky.social @friedler.net, we show how existing explanation frameworks fail and present an alternative for recourse

24.04.2025 06:19 👍 16 🔁 7 💬 1 📌 1