Pre-registration vs. Doing Actual Science

You’ve read the title. You know what this article is going to be about.



To kick off, let me start by saying I am a proponent of both open science and pre-registration. I believe in locking in the experimental design (if applicable) and a plan of analysis based on the (expected) data. It’s a great way of guarding against p-hacking, assuming that people were honest in their pre-registration from the start (who’s going to prove you weren’t?). However, the thing I’m most interested in is having all the files uploaded and freely available once the research is done, so others can either check or replicate the study itself. You might be wondering, after an introduction like that, where the hiccup lies. Simple: I’ve now done too many pre-registrations that look nothing like the analyses I actually conducted. And I wasn’t p-hacking in the slightest.


 

For my very first paper I wrote a pre-registration for the set-up and analysis of the first experiment. The experiment (in hindsight, more of a real-time survey) was dirt simple. The analysis, as a result, should have been simple too. We had one type of model (Gaussian dispersion), on one dependent variable (price recall error), with one main hypothesis and independent variable of interest (payment method), and several pre-registered covariates (number of items, point of sale, time of day, day of week, etc.). You get the picture. Easy does it. However, easy did not do it at all. A Gaussian dispersion model requires the data to be distributed in a specific way, which it was not. There goes that then. The distribution of price recall error was not remotely as expected, so that went out the window quite quickly too. Shame that. So suddenly, you find yourself writing an explanation in your actual paper for why you didn’t manage to follow the pre-registration to the letter. That explanation takes up an embarrassingly long part of experiment 1’s results section. Awesome.

Another issue we ran into in the pre-registration: who do you exclude and why? A reason for pre-registering exclusion criteria is to stop you from cutting down your sample in a way which “suddenly” makes your results significant and/or in the direction of your hypotheses. You can defend your reasoning in the pre-registration for applying certain exclusion criteria (e.g. all spends above 45 pounds are excluded because contactless payment methods cannot be used for any amount above 45 pounds). One of the easiest exclusion criteria to defend is that of missing data. However, once you have locked this in and you’ve collected your data, there may be other exclusion criteria you want to work with. One of your pre-registered criteria might have been based on prior work and not be applicable to your work at all. Or, you might have missed an exclusion that makes a lot of sense for your work.

In my case, I had to throw out 81 observations, as these were transactions made with the Warwick Student Card, a pre-paid card which didn’t fit the contactless/non-contactless divide. Having both a pre-registered set of exclusion criteria and some additional exclusion criteria mentioned in the results section looks messy in the best-case scenario, and shady in the worst-case scenario (leaning towards p-hacking). I can promise you the first two paragraphs of my study 1 results section are atrocious. You live and you learn.
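To make that bookkeeping concrete, here is a minimal Python sketch (not the code used for the paper) of one way to keep pre-registered and post hoc exclusions visibly separate, so that each dropped observation carries a label saying when its rule was decided. The 45-pound limit, the missing-data rule and the student card come from the description above. The field names (`spend`, `recalled`, `card_type`), the card labels and the toy rows are hypothetical and made up purely for illustration.

```python
from collections import Counter

CONTACTLESS_LIMIT = 45.0  # pounds; the contactless cap described in the text


def exclusion_reason(row):
    """Return (stage, reason) if the row should be excluded, else None.

    Field names are hypothetical; they are not taken from the real dataset.
    """
    # Pre-registered criteria: fixed before any data were collected.
    if row.get("spend") is None or row.get("recalled") is None:
        return ("pre-registered", "missing data")
    if row["spend"] > CONTACTLESS_LIMIT:
        return ("pre-registered", "spend above contactless limit")
    # Post hoc criterion: only became apparent once the data were in.
    if row.get("card_type") == "student_card":
        return ("post hoc", "pre-paid student card")
    return None


def apply_exclusions(rows):
    """Split rows into kept observations and a tally of exclusion reasons."""
    kept, log = [], Counter()
    for row in rows:
        reason = exclusion_reason(row)
        if reason is None:
            kept.append(row)
        else:
            log[reason] += 1
    return kept, log


if __name__ == "__main__":
    # Made-up toy rows, only to show the output format.
    toy_rows = [
        {"spend": 12.50, "recalled": 12.00, "card_type": "contactless"},
        {"spend": 60.00, "recalled": 55.00, "card_type": "chip_and_pin"},
        {"spend": 8.20, "recalled": None, "card_type": "contactless"},
        {"spend": 5.00, "recalled": 5.00, "card_type": "student_card"},
    ]
    kept, log = apply_exclusions(toy_rows)
    print(f"kept {len(kept)} of {len(toy_rows)} observations")
    for (stage, reason), n in sorted(log.items()):
        print(f"{stage}: {reason} (n={n})")
```

Reporting the tally by stage does not make a post hoc exclusion any less post hoc, but it lets a reader see at a glance how many observations each kind of rule removed.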


Now you might be thinking that’s where the issues with the pre-registration ended, and that I just had a bad run with it, but that’s not true at all. That was just my view of the analysis. And your own view when doing a paper is not the only view. There are also the journal reviewers. As expected, my change of both model and dependent variable raised some eyebrows. Comments were made about the analysis not following the pre-registered plan, as well as the exclusion criteria not lining up. However, with clear descriptions as well as detailed footnotes, you can save your ass. The main thing to come out of this paper submission was a revise & resubmit (a.k.a. “r&r”). We took the r&r, and from the comments it became clear that the reviewers wanted another study, one that established a causal connection. Okay then. New study, new pre-registration.


 

Having learned some valuable lessons from the prior pre-registration, we knew what the design would look like (now with some small tweaks for both causality and covid-19 restrictions), and we knew the analysis: a complete copy+paste job from what we actually did in study 1. So the pre-registration now featured linear models rather than Gaussian dispersion models, and focused on the dependent variable correct, a 0/1 dummy variable indicating whether or not people correctly recalled their expenditure, rather than error, which was calculated by subtracting the actual expenditure from the estimated expenditure. It wasn’t an exact copy+paste job either, as we controlled for some more variables, but it was pretty close. Did that work for the analysis? Yes. There were very few changes that needed to be made, on our part.

Several months later we resubmitted the manuscript. We heard back three weeks later, which I thought was pretty quick. When I saw the email I was hoping for an early Christmas gift: that the paper had been approved for publication. Alas, this was not the case (woe is me, boohoo). Looking at the comments, it was pretty clear that one of the reviewers (just guess their bloody number) wasn’t happy with what was now study 2. Their main grievance was the dependent variable. They didn’t want a measure of correct, they wanted the analysis conducted on error. From a pre-registration standpoint, I was quite confused as to how I could be asked for this new analysis. First, they had read study 1 before the resubmission, as it was in the original submission. They knew the variable we had chosen, and didn’t reject it the first time around. Second, given this “silent approval” of correct as a variable, I pre-registered study 2 with the same variable, not even mentioning error. What is the best thing to do here?

Obviously, I will implement the reviewer’s suggestion, but what we now have is two studies with completely mismatched pre-registrations. The first, I’ll admit, was mainly my fault. It was the very first study in my PhD, so there was definitely a learning curve. The second? Well, that’s the reviewer’s fault. And here we have science full circle.
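For readers who want to see how the two dependent variables relate, here is a small Python sketch under stated assumptions. The definition of error (estimated minus actual expenditure) is the one given above. How close a recall had to be to count as correct is not spelled out here, so the `tolerance` parameter, which defaults to an exact match to the penny, is an assumption for illustration only. The function names are hypothetical too.

```python
def recall_error(estimated, actual):
    """Signed recall error in pounds: estimated spend minus actual spend."""
    # Rounded to the penny to avoid floating-point noise such as 0.30000000000000004.
    return round(estimated - actual, 2)


def recalled_correctly(estimated, actual, tolerance=0.0):
    """Return 1 if the recall is within `tolerance` pounds of the actual spend, else 0.

    The tolerance is an assumption for illustration; the default is an exact match.
    """
    return int(abs(recall_error(estimated, actual)) <= tolerance)


if __name__ == "__main__":
    # Made-up (estimated, actual) pairs, not data from the study.
    for estimated, actual in [(12.50, 12.50), (12.00, 12.50), (30.00, 12.50)]:
        print(
            f"estimated={estimated:.2f} actual={actual:.2f} "
            f"error={recall_error(estimated, actual):+.2f} "
            f"correct={recalled_correctly(estimated, actual)}"
        )
```

The sketch shows why the choice matters: someone who is 50 pence under and someone who is 17.50 pounds over both end up with correct equal to 0, while error keeps both the size and the direction of the miss.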


This article is not a way for me to hate on pre-registrations. As I said in the introduction, I am a massive proponent of pre-registrations, especially when it comes to knowledge sharing. If this paper ever gets published, with either error or correct as the dependent variable, I’ll make sure to include a note outlining both our and the reviewers’ decision-making process, so readers know what’s going on. My real reason for writing this post was just to show the ideal vs. the reality of doing pre-registrations. There are good reasons for doing pre-registrations, but often the reality pans out very differently from the theorized expectation, as it always does with science.