A Bunch of BS...

Merle van den Akker
Jan 14, 2021

Updated: Jan 15, 2021

One of my previous articles was on GAABS – a new gatekeeper for good behavioural science. Today’s article is about what came before GAABS, and what hopefully won’t get their stamp of approval: a whole load of BS coming from the behavioural sciences.

You can’t genuinely be surprised that we’ve produced really questionable stuff. Every field has. For some fields this just happened decades, if not centuries, ago. Well, behavioural science isn’t that old. So, it’s happening now. A great example of Behavioural Science BS was a best-selling book on personality types, or colours, if you will. The book is called “Surrounded by Idiots,” which is exactly what the author must have thought once it hit the best-seller list. The entire book focuses on the “science” of personality types. You could test yourself on certain traits and they’d be linked to a colour. So you could be predominantly green with a bit of blue. What that meant still isn’t 100% clear to me, but it would probably mean you scored high on traits such as extraversion, mixed in with a tinge of openness. This Medium article goes into even more depth, if you want more background. What happened here? Well, the author, when appearing in public, very quickly showed his true colours (pun intended). The dude knew very little about psychology or behavioural science. It was one hell of a hoax. It was wild. This is exactly the type of BS I enjoy. But it’s only really funny in hindsight. And as this happened in 2018, it’s been two years of hindsight.

One of my favourite behavioural scientists and writers, Jason Collins, takes a critical stance as well. He looks into the most famous studies within the field, and their failure to replicate. Oops. A notable example is of course the “old age priming” experiment. Participants were made to unscramble words. Some were in a neutral condition, others were in the “age” condition, where they unscrambled words such as wrinkly, grey, old, etc. Then the participants were timed walking from the lab to the elevator, and those in the old age condition walked significantly slower. Awesome, we can prime old age. Not sure why you’d want to, but you can. Or can you? Because it didn’t replicate, did it… One issue with this experiment is quite simple: you can’t just test 30 students and call it a win, regardless of the results.

This isn’t exactly the only study that hasn’t replicated. Jason mentions quite a few more, both within the priming domain (a very controversial domain anyway) and outside of it. Another favourite gets heckled: the bat-and-ball problem. Don’t worry, we’re not throwing out the idea of system 1 vs. system 2; we haven’t reached that stage yet. The idea was that people who read the bat-and-ball problem in a harder-to-read font or a lighter colour took more time to approach the problem, slowing them down and allowing system 2 to kick in. This meant they scored higher (they got the answer right more often). Which is great. But it didn’t replicate. The original study was done with 40 students (you can see where this went wrong, right?), and did not replicate when conducted with thousands of participants. I’m not exactly an expert in power calculations, but I can figure out what happened here. The small sample wasn’t representative. Who would have thought…

Now there is no need for me to go on and continue to bash these studies. Although being hateful is my second nature, there isn’t much point. These studies and their authors have had rotten tomatoes thrown at them for a long time. And the idea isn’t to do that. The idea is to be a bit more critical of what we’re being presented with. As soon as I see that something has been tested on fewer than 200 people, especially if it has multiple conditions, I become quite sceptical. Because it means the study is likely underpowered. And even if you find cool things, they are not very likely to replicate on a larger scale. Unless the effect you found is super robust. And that’s just the issue. A lot of things and effects found at “the start of behavioural science” have been heralded as robust and generalizable, whereas that’s not necessarily the case. This is exactly how the replication crisis started (well, part of it). We need to do better. Samples need to increase. Testing needs to go beyond students at universities. That’s not a representative sample. At all. Testing needs to go beyond the States, or the Western world in general. Out of WEIRD (Western, Educated, Industrialised, Rich, Democratic), into reality. We have been neglecting to do so for too long.
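
To put some rough numbers on "underpowered", here is a back-of-the-envelope sketch in Python. It is not a reconstruction of how any of the studies above were analysed. It assumes a simple two-group comparison with equal group sizes, a two-sided test at alpha = 0.05, and effect sizes (Cohen's d) that I picked purely for illustration. It also uses a normal approximation to the t-test, which is slightly optimistic for small samples. The function names are my own.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison.

    Normal approximation to the two-sample t-test, assuming equal
    group sizes. d is the assumed true effect size (Cohen's d).
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n_per_group / 2)  # expected value of the test statistic
    # Probability the statistic lands beyond either critical value.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_needed(d: float, power: float = 0.8, alpha: float = 0.05) -> int:
    """Approximate participants needed per group to reach the target power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / d) ** 2)


if __name__ == "__main__":
    # Hypothetical scenarios: total sample split evenly over two conditions.
    for total in (30, 40, 200, 2000):
        for d in (0.2, 0.5):  # conventional "small" and "medium" effects
            p = approx_power(d, total // 2)
            print(f"N={total:5d}  d={d}:  power ~ {p:.2f}")
    print("n per group for 80% power at d=0.5:", n_per_group_needed(0.5))
    print("n per group for 80% power at d=0.2:", n_per_group_needed(0.2))
```

Under these assumptions, a study with 30 or 40 participants split over two conditions has well under a coin flip's chance of detecting even a medium-sized effect, and a small effect needs several hundred participants per group. Add more conditions and each group shrinks further, which is why small multi-condition studies make me nervous.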

Also, if the title of this post sounded familiar to you – I shamelessly ripped off Nick Hobson’s podcast “A bunch of BS.” If you haven’t listened to that yet, please do give it a try. It’s a great show 😊