i do empathize with reactions against "AI as art creator" types of applications, though. there are certain things that need human involvement to be considered meaningful, even if at the surface they are hard to distinguish
yeah there is a real risk of falling very behind if you reflexively decide against using LLMs these days (at least if your work involves writing lots of code)
Thanks! If I ever follow up on the blog I'll def take a look
tbf my earliest blog trilogy is on modeling multi-armed bandit tasks like the IGT, e.g. see below along with the next 2 in the sequence:
haines-lab.com/post/2017-04...
Ha maybe that will be the next one! Given I just got the IAT one out, and my pacing of one blog every 1-2 years, might be a while 😅
> that's for the community to pursue
I think this is the crux of the issue—there's not a single "community" using the IGT. Math psych/cognitive science folks are operating under a distinct set of standards + measurement practices, which I believe is obscured by treating the lit monolithically
Yes totally makes sense given the goal! My point in general is that people often discount an entire literature based on results like this, when in fact there are bodies of work within the literature that have given quite a lot of attention to measurement. The IGT is one such case
Not denying the mess 😁
bsky.app/profile/nate...
Not to discount the fact that measurement and use of the IGT is generally a mess in the literature! Because it certainly is. But I think these meta reviews/analyses often miss sound areas of research when looking at the forest. Models are the best measurement tools we have for these tasks
Great post! One thing the IGT paper misses is that, despite what a bunch of folks are doing with sum scores, there is a rich history of building principled models of the IGT, dating back to Busemeyer & Stout (2002). The paper leaves you thinking the whole literature is a mess, which is not true IMO
that's not a bad idea 🤓
appreciate it! and this is never going to be a paper.. im blogging for the love of the game, the publishing process would kill the fun
An aside—LLMs have basically solved data viz for me. What used to take a good bit of tinkering in ggplot or plotly now can be one-shotted with a data snippet/schema as context and a plain English description of what I want. The bar for data viz is now higher 🤓
ah so an actual paper and not a blog project?? lol
Ooo very nice, will be curious to read more on that!
Thank you!
And yes you definitely could, although I think drift rate is the most plausible mechanism in this case, e.g. the conflicting stimuli produce trade-offs during evidence accumulation that dampen the accumulation process.
That said, task manipulations (e.g. priming) may influence other parameters
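The drift-rate mechanism above can be illustrated with a toy simulation (this is a bare-bones sketch, not the fitted model from the blog): dampening the drift rate of a simple drift diffusion process produces slower and less accurate responses, which is exactly the behavioral signature of conflict trials.

```python
import numpy as np

def simulate_ddm(drift, boundary=1.0, dt=0.005, max_t=5.0, n_trials=1000, seed=0):
    """Simulate a bare-bones drift diffusion model.

    Evidence accumulates toward +/- boundary with Gaussian noise; returns
    the fraction of upper-boundary (correct) responses and the mean RT.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(max_t / dt)
    steps = drift * dt + np.sqrt(dt) * rng.standard_normal((n_trials, n_steps))
    paths = np.cumsum(steps, axis=1)
    hit = np.abs(paths) >= boundary
    crossed = hit.any(axis=1)                        # trials that reached a boundary
    first = np.argmax(hit, axis=1)                   # index of first crossing
    rt = (first + 1) * dt
    correct = paths[np.arange(n_trials), first] > 0  # upper boundary = correct
    return correct[crossed].mean(), rt[crossed].mean()

# Conflict is assumed here to dampen the drift rate; compare strong vs dampened:
acc_hi, rt_hi = simulate_ddm(drift=2.0)  # low-conflict trial: strong accumulation
acc_lo, rt_lo = simulate_ddm(drift=0.5)  # high-conflict trial: dampened accumulation
print(acc_hi > acc_lo, rt_hi < rt_lo)    # dampened drift -> less accurate, slower
```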
In case you missed it 🤓
9/9 The interpretation depends on how you evaluate discriminant validity. Implicit and explicit attitudes share ~ 67% of variance, but that still leaves 33% unique variance. Regardless, the IAT is a very noisy measure, and I wouldn't trust results that don't account for measurement error 🤓
8/9 Low loadings mean low reliability, and low reliability means summary measure correlations are attenuated by measurement error. But! Our models account for this, and model 2 reveals a latent correlation of r = .82. Model comparison adds to the story, altogether ruling out r = 0 and r = 1
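The attenuation logic in 8/9 follows from the classical correction-for-attenuation formula. A quick back-of-envelope check (the reliabilities below are made-up illustrative values, not the fitted ones) shows how a latent r of .82 can shrink into the meta-analytic .2–.3 range once both measures are noisy:

```python
import math

def attenuated_r(r_latent, rel_x, rel_y):
    """Classical attenuation: observed r = latent r * sqrt(rel_x * rel_y)."""
    return r_latent * math.sqrt(rel_x * rel_y)

# Illustrative (made-up) reliabilities for the IAT and self-report measures:
r_obs = attenuated_r(0.82, 0.35, 0.30)
print(round(r_obs, 2))  # a latent .82 shows up as an observed ~.27
```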
7/9 After fitting our models, we can derive standardized loadings that tell us the correlation between the latent construct(s) and the indicators (drift rates for the IAT; items for self-reports). The IAT loading is generally quite low, although so are some of the survey items:
6/9 So, these models better account for task- and item-specific variance when estimating the latent constructs, but we also need to consider measurement error. To do so, we model all data simultaneously, making different assumptions about the latent implicit-explicit correlation:
5/9 For self-report items, depending on whether the item requires a Likert vs continuous thermometer response, we can use categorical vs continuous models inspired by Item Response Theory (IRT). The left and right plots here illustrate how IRT models can capture either type of item:
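For the categorical (Likert) case in 5/9, one standard IRT choice is the graded response model; here is a minimal sketch of its category probabilities (an assumption for illustration — the blog's categorical model may be parameterized differently):

```python
import numpy as np

def graded_response_probs(theta, discrimination, thresholds):
    """Graded response model: P(response = k | theta) for ordered categories.

    P(Y >= k) = logistic(a * (theta - b_k)); category probabilities are the
    differences between adjacent cumulative probabilities.
    """
    thresholds = np.asarray(thresholds, dtype=float)  # must be increasing
    p_ge = 1.0 / (1.0 + np.exp(-discrimination * (theta - thresholds)))
    p_ge = np.concatenate(([1.0], p_ge, [0.0]))  # P(Y>=lowest)=1, P(Y>highest)=0
    return -np.diff(p_ge)                        # one probability per category

# 3 thresholds -> 4 ordered response categories for a hypothetical Likert item:
probs = graded_response_probs(theta=0.5, discrimination=1.5, thresholds=[-1, 0, 1])
print(probs.round(3))  # category probabilities, summing to 1
```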
4/9 Of course, summary measures like the IAT D-Score or summed scores from surveys both: (1) conflate the underlying construct of interest with task- or item-specific variation; and (2) ignore measurement error. We can do better with generative models, starting with a conflict model of the IAT:
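To make the conflation point in 4/9 concrete, here is a simplified D-score sketch (it omits the error penalties and trial trimming of the full Greenwald et al. 2003 algorithm; the latencies are made up): everything driving response times — caution, conflict, non-decision time — gets compressed into one number.

```python
import numpy as np

def iat_d_score(rt_compatible, rt_incompatible):
    """Simplified IAT D-score: mean latency difference between incompatible
    and compatible blocks, scaled by the pooled SD of all latencies.
    """
    rt_c = np.asarray(rt_compatible, dtype=float)
    rt_i = np.asarray(rt_incompatible, dtype=float)
    pooled_sd = np.concatenate([rt_c, rt_i]).std(ddof=1)
    return (rt_i.mean() - rt_c.mean()) / pooled_sd

# Toy latencies in ms for one hypothetical participant:
d = iat_d_score([650, 700, 675, 640], [820, 790, 860, 805])
print(round(d, 2))  # one summary number, conflating every underlying process
```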
3/9 The reason it gets complicated—meta-analyses consistently reveal correlations around r = .2-.3 between IAT D-scores and self-report measures, which IAT proponents use to claim discriminant validity (implicit constructs must be real and different from explicit). Our data follow this pattern:
2/9 The basic question we are trying to resolve is deceptively simple: "What is the relationship between implicit attitudes (as measured by the IAT) and explicit attitudes (as measured by self-report)?". This question has created a ton of controversy, both in and outside of academia
1/9 New blog is live! This is part 2 of a series—last time we looked at the Dunning-Kruger effect, now we are digging into Implicit vs Explicit attitudes and the Implicit Association Test. To start, of course we need a good meme...
haines-lab.com/post/part-2-...
the hardware angle is truly bizarre, seems desperate tbh
i mean look at rabbit r1 lol
and a fun look into the data generating process for the implicit association test 🧐
coming soon! part 2 of the series (finally..), this time looking at implicit attitudes instead of the Dunning-Kruger effect—here is a sneak peek 🤓