For those tuned into the field of behavioural science, you cannot have missed the meta-analysis that recently went viral: “The effectiveness of nudging: A meta-analysis of choice architecture interventions across behavioral domains”. And as always when someone “attacks” the holy grail that nudging seems to be, there was outrage. Now I don’t tend to be late to the party, but I thought I’d let emotions simmer down before I even tried reading the paper (I also had a busy week, but that’s neither here nor there). Upon reading the abstract, I’m already wondering whether I’m reading the right paper. It very clearly states that nudges do have an effect: “Here we quantitatively review over a decade of research, showing that choice architecture interventions successfully promote behavior change across key behavioral domains, populations, and locations.” And further: “Our results show that choice architecture interventions overall promote behavior change with a small to medium effect size of Cohen’s d = 0.43 (95% CI [0.38, 0.48]). In addition, we find that the effectiveness of choice architecture interventions varies significantly as a function of technique and domain…. Overall, choice architecture interventions affect behavior relatively independently of contextual study characteristics such as the geographical location or the target population of the intervention. Our analysis further reveals a moderate publication bias toward positive results in the literature.” None of this screams to me that behavioural science is dying. So what’s the panic about?
I continue reading, like the good ex-academic that I am [may I also congratulate the authors on actually writing an easy-to-read paper, no small feat!]. They reference several other papers that look at behavioural intervention techniques in meta-analyses, single-shot intervention reviews and the effectiveness of nudges in specific behavioural domains (for your reference, that’s references 29–39 in the paper). Here is where we have to stop reading chronologically, as PNAS (the journal) has certain criteria for meta-analyses, which means the literature review is followed by the results, which reiterate what we already knew from the abstract: “…meta-analysis of 447 effect sizes from 212 publications revealed a statistically significant effect of choice architecture interventions on … Using conventional criteria, this effect can be classified to be of small to medium size. The effect size was reliable across several robustness checks, including the removal of influential outliers.” Maybe more interesting are the next two paragraphs. The first mentions the heterogeneity in the sample: “The total heterogeneity was estimated to be τ2 = 0.16, indicating considerable variability in the effect size of choice architecture interventions. More specifically, the dispersion of effect sizes suggests that while the majority of choice architecture interventions will successfully promote the desired behavior change with a small to large effect size, ∼15% of interventions are likely to backfire, …. with a small to medium effect.” So not all nudges work (quelle surprise). For the final nail in the coffin: the effect of publication bias. Before we slam dunk behavioural science into oblivion, I would like to mention that this is not a field-specific problem; it is a publication and therefore an academia-wide problem. Cool, weird, exciting, new stuff with positive results gets published; the rest has a fat chance. This is also (partially) why we’re having the replication crisis.
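To see where that “∼15% of interventions are likely to backfire” figure plausibly comes from: under a standard random-effects model, the true intervention effects are treated as normally distributed around the pooled estimate, with between-study variance τ². A minimal sketch, assuming the paper’s reported d = 0.43 and τ² = 0.16 (the authors’ exact model may differ in its details):

```python
import math

def norm_cdf(x):
    # Standard normal CDF, built from the error function (stdlib only)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

d_pooled = 0.43      # pooled effect size reported in the meta-analysis
tau2 = 0.16          # reported between-study heterogeneity (tau squared)
tau = math.sqrt(tau2)

# Under a random-effects model, true effects ~ N(d_pooled, tau^2).
# A "backfiring" intervention is one whose true effect is below zero:
p_backfire = norm_cdf((0 - d_pooled) / tau)
print(f"Expected share of backfiring interventions: {p_backfire:.1%}")
# prints roughly 14%
```

With these numbers, roughly 14% of the distribution of true effects falls below zero, which lines up neatly with the authors’ ∼15% backfire estimate.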
Now, what did the authors find: “… revealed an asymmetric distribution that suggested a one-tailed overrepresentation of positive effect sizes in studies with comparatively low statistical power. This finding was formally confirmed by Egger’s test which found a positive association between effect sizes and SEs ... Together, these results point to a publication bias in the literature that may favor the reporting of successful as opposed to unsuccessful implementations of choice architecture interventions in studies with small sample sizes. … this one-tailed publication bias could have potentially affected the estimate of our meta-analytic model. Assuming a moderate one-tailed publication bias in the literature attenuated the overall effect size of choice architecture interventions by 22.5% from Cohen’s d = 0.40, 95% CI [0.36, 0.44], τ2 = 0.16 (SE = 0.01) to d = 0.31, τ2 = 0.18. Assuming a severe one-tailed publication bias attenuated the overall effect size even further to d = 0.08, τ2 = 0.26; however, this assumption was only partially supported by the funnel plot. Although our general conclusion about the effects of choice architecture interventions on behavior remains the same in the light of these findings, the true effect size of interventions is likely to be smaller than estimated by our meta-analytic model due to the overrepresentation of positive effect sizes in our sample.” It’s obviously the last point which had people in a tizzy, because a Cohen’s d of 0.08, well, that’s not good. Actually, it’s pretty f’ing bad. However, this claim is immediately followed by the caveat that it rests on an assumption that was only partially supported, and by a visual inspection at that (the funnel plot is a graph from which the authors read the shape of the effect-size distribution). So honestly, this is not the number we’re looking for.
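The attenuation percentages quoted above are easy to sanity-check yourself from the reported effect sizes alone. A quick sketch:

```python
d_original = 0.40   # meta-analytic estimate before bias adjustment
d_moderate = 0.31   # under the assumed moderate one-tailed publication bias
d_severe = 0.08     # under the assumed severe one-tailed publication bias

# Percent reduction relative to the unadjusted estimate
attenuation_moderate = (d_original - d_moderate) / d_original
attenuation_severe = (d_original - d_severe) / d_original

print(f"Moderate bias scenario: {attenuation_moderate:.1%} reduction")
# prints: Moderate bias scenario: 22.5% reduction
print(f"Severe bias scenario:   {attenuation_severe:.1%} reduction")
# prints: Severe bias scenario:   80.0% reduction
```

The moderate scenario reproduces the 22.5% figure exactly; the severe scenario implies an 80% reduction, which is precisely the assumption the funnel plot only partially supports.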
All that can really be taken away from this meta-analysis, which the authors fully admit to and even emphasize, is that “the true effect size of interventions is likely to be smaller than estimated by our meta-analytic model (Cohen’s d of 0.43) due to the overrepresentation of positive effect sizes in our sample.” That’s it. The next paragraphs then analyze nudges in specific domains and find that “the effectiveness of interventions was moderated by domain... Specifically, it showed that choice architecture interventions…, had a particularly strong effect on behavior in the food domain, with… the smallest effects observed in the financial domain… this domain was less receptive to choice architecture interventions than the other behavioral domains we investigated.” Gotcha.
For those who are a bit new to meta-analyses, you really cannot just read the results and move on. You’re going to have to check the actual methods applied in the analysis, with a main focus on sample selection. Sample selection in this case means: which studies were selected, which weren’t, and for both of those: why? The methods can be found towards the end of the paper. The authors reference two meta-analysis frameworks (both lauded and widely applied). They clearly state the resources used (journals and databases), search terms and time-based restrictions, with a snowballing technique for reference hunting. Further criteria focused on methods (RCTs), outcome measure, unit of analysis and language. Nothing out of the ordinary here, and maybe more importantly: it can be replicated. The only thing that might read as a tad off is that the initial selection of 9,606 publications got reduced to 212. What I expect the main cutting point to have been is the RCT criterion, as a lot of nudges don’t tend to get the RCT treatment. I cannot say with 100% certainty that that’s it, because even the diagram in the Supporting Information section doesn’t give that amount of detail (I dig deep for these posts, don’t you worry!). If someone could get me more details on the exclusions I’d love to have them, because to me, they are the most important part!
The methods and supporting information did not give up as many secrets as I had hoped, but I have to say I couldn’t spot anything really out of order, so I cautiously move back “up” to the discussion and conclusion sections. The authors mention different meta-analyses done on the topic and why results often seem inconsistent. The reason? Different inclusion and exclusion criteria, which means you end up with a different sample. And if there’s one thing we know about behavioural science, it is that a sample can make or break an effect (we’re nowhere near finding a universal theory of human behaviour, if ever). The discussion then dives deeper into specific interventions that consistently work better than others, interventions in the food domain and, as always, the need for further research. Nothing controversial here either. It all gets wrapped up nicely by the conclusion, with a reiteration that nudges are effective. I am genuinely convinced I read a completely different paper than most of the angry Twitter mob.
So, 1,000+ words and a very good paper later, where are we at? The same place as before, I’m afraid. Nudges do work, but they can backfire. They have a small to medium effect size when they do work. Again, not controversial; we’ve known this for ages. There’s a publication bias. Bloody obviously. There is nothing controversial in this paper. If you’re “shocked” by any of its findings, you haven’t been paying attention to any of the developments in the field for the past 15 years. And I feel like that’s on you, not the authors of this meta-analysis. So yes, I confess: the title of today’s post was nothing but clickbait. Nudge isn’t dead yet, and won’t be rolling in its grave anytime soon. But what this does tell us is that more work is required to amplify all the non-nudge work being done in behavioural science. Because there is a lot of it. And it’s being done a disservice by being “rocked” by a single meta-analysis, whether that analysis had good or bad things to say about the effectiveness of nudging.