The 5Ws of Preregistration

This post is a long-awaited follow-up to my February 2018 post titled “All About Pre-Registration”. I began that post with the following: “Presently there is a large degree of variability regarding the understanding and application of pre-registration in psychological science.” Six years later, there is STILL no consensus about many aspects of preregistration (yes, I am dropping the “-”). During this time I have personally found it a little odd that so many opinions about preregistration have been expressed in published papers, and particularly on social media, even though the people making these comments likely hold divergent views *of* preregistration. Some of the arguments against preregistration that make me shake my head the most are that “it is not a panacea” (so the bar for introducing practices to potentially improve our science is that it has to be a solution for ALL problems in our field?), “it stifles creativity and exploration” (well, maybe for you…), “I cannot be expected to know all my hypotheses with my big study that has longitudinal components” (totally get it, but when I read your papers you often say “as expected” or “as predicted”, so…), “preregistration is only for purely confirmatory research” (ok, but what is your definition of preregistration, because it is not one I am familiar with), and “people deviate from them so preregistration does not work” (work for what exactly? And let’s forget for a moment that the only reason we can obtain information regarding deviations between research intentions as stated in a preregistration and actions as written about in a published article is that the preregistration exists).
But I also hear proponents of preregistration say things like “preregistration is not needed for that type of research” (of course, for now preregistration is not “needed” to publish research papers in general, in that it is not universally mandatory, but this statement implies that there is value in preregistering only certain kinds of research activities, and only certain types of information regarding those activities).

Given the lack of general agreement about the nature and goals of preregistration, I have put together in this post answers to the 5Ws of preregistration: Who? What? Where? When? and Why? The answers are obviously not agreed upon by everyone (see the first paragraph of this post), and they are based on my own experiences with preregistration and other open science practices over the past ten years. That includes working with my own lab to generate preregistrations for research that ranges from primarily confirmatory to primarily exploratory (and everything in between), discussing the practice with colleagues when invited to present on the topic at conferences and in departmental talks, reading social media posts way too often, and publishing papers on the topic (e.g., Campbell, Loving, & LeBel, 2014; LeBel, Campbell, & Loving, 2017; Moshontz, Campbell, … Chartier, 2018; Nosek, Beck, Campbell, Flake, Hardwicke, Mellor, van’t Veer, & Vazire, 2019).

To begin, I have put together a brief cheat sheet of the five Ws of preregistration, with short answers:

The 5Ws of Preregistration:

Who? – Any individual researcher or group of researchers evaluating ideas via the collection and analysis of data.

What? – Preregistration is both a concept and a concrete action, or actions. A working definition of preregistration that is generally agreed upon by those tasked with engaging in the practice is therefore needed to encourage progress on standardizing the practice. An agreed-upon definition, and a subsequent translation of this definition into a standard set of practices, currently does not exist.

When? – Typically, preregistration should occur before the proposed actions specified in the preregistration are carried out.

Where? – Preregistration information should be timestamped and publicly available in perpetuity without the option to delete shared information. Any updates or additions to a preregistration should be transparent to anyone viewing the preregistration information.

Why? – The goal, or goals, motivating the practice of preregistration. There is currently a great deal of disagreement regarding the purpose of preregistration, meaning it is possible for one person to declare that preregistration “works” and another to declare that it “does not work”, with each statement being correct given the different inferred goals of the practice. This is not a desirable situation.

In the next sections, I provide expanded answers to the five Ws of preregistration (this is the first draft of this post on January 30, 2024, and I will likely add to it and lightly edit).

Who?

In my field of psychology, people typically develop and advance their careers by generating research questions and/or hypotheses that can be assessed by designing research studies and comparing study outcomes with expectations. Results that contrast with expectations can be very useful for generating novel insights. A researcher may also evaluate data in the absence of expectations or with vague expectations, developing new research ideas based on the pattern of observations across the set of available variables in a given data set. Overall, any researcher who obtains data for the purpose of discovery has the option to share up front, in an open and transparent manner, their research intentions and planned actions.

What?

In January of 2024 I conducted an informal internet search for definitions of preregistration. The top hits of my search appear below.

APA.org: “Preregistration allows researchers to specify and share details of their research in a public registry before conducting the study.”

COS.io: “When you preregister your research, you’re simply specifying your research plan in advance of your study and submitting it to a registry. Preregistration separates hypothesis-generating (exploratory) from hypothesis-testing (confirmatory) research.”

“What is pre-registration? Pre-registration involves making hypotheses, analytic plans, and any other relevant methodological information for a study publicly available before collecting data.”

COS from May 13, 2023: Preregistration is the practice of documenting your research plan at the beginning of your study and storing that plan in a read-only public repository such as OSF Registries or the National Library of Medicine’s Clinical Trials Registry.

Surrey.ac.uk: “The goal is to create a transparent plan, ahead of beginning your study/accessing existing data. So, your preregistration will include details about the study, hypotheses (if you have them), your design, sampling plan, search strategy (for reviews) variables, exclusion and inclusion criteria, and the analysis plan.”

The Administration for Children and Families (USA): “Pre-registration is the practice of deciding your research and analysis plan prior to starting your study and sharing it publicly, like submitting it to a registry.”

PLOS.org: Preregistration is the practice of formally depositing a study design in a repository—and, optionally, submitting it for peer review at a journal—before conducting a scientific investigation.

Common Themes: Publicly sharing details of planned research projects in an appropriate registry prior to carrying out the planned research actions.

My working definition of preregistration:

Stating as clearly and specifically as possible what you plan to do, and how, before doing it, in a manner that is verifiable by others (From: https://www.lornecampbell.org/?p=181)

All of the above definitions of preregistration are consistent in that they refer to publicly sharing intentions prior to actions. Where many differences of opinion occur is not with these somewhat vague definitions, but instead with the translation from the conceptual (“create and publicly share research plan prior to carrying out research plan”) to the concrete (from “you can share plans for all of your research” to “sharing is only needed and/or useful for specific data analytic decisions for confirmatory hypotheses and nothing else”). It seems to me, therefore, that there is actually some general agreement regarding the higher order definition of preregistration, but important differences between researchers in how this definition is translated for different types of research.

In the figure below, I refer to research intentions and actions. With respect to intentions, I roughly divide research projects into explorations and hypothesis testing research. Explorations loosely include purely descriptive research (e.g., assessing base rates of behavior in a given population) as well as “fishing” (e.g., calculating correlations between study variables, running a lot of models that include different variables or the same variables with different combinations of items, trimming the sample, and so on). Hypothesis testing includes evaluating vague hypotheses (e.g., two variables should be positively correlated, the mean for group A should be higher than group B) as well as very concrete hypotheses (e.g., using data obtained via specific measures the correlation between two variables should be in a given range, or the mean for Groups A and B should be in a given range and differ by more than 1 unit). These categories are simply meant to cover research ranging from studies with very limited expectations regarding outcomes to studies with very specific expectations regarding outcomes. With respect to actions, I list four categories of the types of concrete information regarding a researcher’s intentions that can be publicly shared. What is obvious from this conceptualization of the research process as it relates to preregistration is that it is primarily the degree of specificity of the information shared (actions) that changes as a function of the research intentions, NOT whether information should be shared or not shared. In other words, you can preregister all of your research projects, with some preregistrations including much more specific information than others. If at this point you disagree with what I just said regarding preregistration, it is very likely we have different views on the goal(s) of preregistration. So keep reading until the end.

When?

Prior to doing what you plan to do (will be updated with more information)

Where?

OSF, aspredicted.org, Git/GitHub (will be updated with more information)
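
For readers who keep their plans in plain text files (for example, in a Git repository), here is one rough way to think about what "verifiable by others" could mean in practice. This is only a sketch, not how OSF or aspredicted.org work internally: the function names `fingerprint_plan` and `plan_matches` and the example plan text are my own hypothetical illustrations. The idea is that a cryptographic hash of the plan changes if even one character of the plan changes, so a hash recorded somewhere public lets others check that the plan they are reading is the one that was originally written. Note that a timestamp generated on your own computer proves nothing by itself; the trusted timestamp has to come from the registry or repository where the plan is deposited.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint_plan(plan_text: str) -> dict:
    """Return a SHA-256 digest of the plan and the UTC time it was computed.

    The time comes from the local clock, so it is a convenience only;
    it is NOT a trusted timestamp.
    """
    digest = hashlib.sha256(plan_text.encode("utf-8")).hexdigest()
    return {
        "sha256": digest,
        "computed_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def plan_matches(plan_text: str, recorded_sha256: str) -> bool:
    """Check whether a plan's text still matches a previously recorded digest."""
    current = hashlib.sha256(plan_text.encode("utf-8")).hexdigest()
    return current == recorded_sha256


if __name__ == "__main__":
    # Hypothetical plan text, for illustration only.
    plan = "H1: Variables X and Y are positively correlated. N = 200."
    record = fingerprint_plan(plan)
    print(record["sha256"], record["computed_at_utc"])

    # The unchanged plan matches; an edited plan does not.
    print(plan_matches(plan, record["sha256"]))
    print(plan_matches(plan + " (edited)", record["sha256"]))
```

Any later change to the plan yields a different digest, which is the property that makes undisclosed edits detectable.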

Why?

Overall, what is the desired outcome, or outcomes, associated with the practice of preregistration? Ask people who believe in the value of preregistration, and those who do not, this question and I predict that you will get a variety of responses. Whereas researchers may have general agreement about a higher order definition of preregistration, there are important differences of opinion with respect to what types of research should be preregistered, and I believe this is largely because of disparate beliefs about the goal(s) of preregistration. What does it help with?

Before moving on, when researchers ask this type of question it sounds like this in my head: “If you want me to share my research intentions with you before I have done said research then you need to prove to me that there is some value in doing so or else I declare it is all a waste of time.” Here is something we wrote in a paper published in 2014: “Ideally we should not need to persuade researchers of the benefits of disclosing details of the research process; instead, researchers should need to provide solid rationale for not openly sharing these details.” (Campbell, Loving, & LeBel, 2014, p. 542). I suppose what I am about to say now is somewhat controversial, but consistent with the 2014 version of myself, I believe researchers should preregister by default unless there is a solid rationale, which needs to be shared, for not doing so.

Primary Goal of Preregistration: To openly and transparently share the decision-making process for a research endeavour, from intentions to proposed actions, to allow for enhanced evaluation.

The availability of this information should allow for achieving many sub-goals (when relevant to a particular research endeavour), such as:

  • Defining research questions and/or pre-planned hypotheses as formed by the researcher(s) prior to collecting data and/or examining existing data
    • Often described as delineating exploratory from confirmatory tests of research questions and/or hypotheses
      • Whereas some explorations can be planned in advance (e.g., obtaining base rate information for some outcomes in a given group or groups; wanting to see differences in a variety of measures between particular groups; estimating correlations among responses to particular measures), other explorations occur during the process of examining the data obtained to assess the research questions and/or hypotheses
  • Detailing with appropriate specificity the research methods the researcher(s) plan to use to evaluate the research questions and/or hypotheses, including:
    • Target population and planned sample
    • Rationale for sample size
    • The planned information to be obtained and how the researcher(s) will obtain this information (e.g., want to obtain the age of participants in the sample, so will ask study participants to indicate their age in years and months or ask study participants to indicate their date of birth; want to assess self-esteem so will ask participants to complete a particular self-esteem scale or scales)
    • Any planned manipulation in the study design, described in appropriate detail to allow others to implement the manipulation in a like manner
    • The planned procedures for conducting all aspects of the study
    • Any data exclusion rules
    • Plans for how data will be analyzed in specific detail as appropriate given the nature of the research questions and/or hypotheses as well as the methods used to obtain the data

Secondary Goals of Preregistration: Preregistration is believed by some advocates to help increase the quality of published research reports in the following ways:

  • Eliminate HARKing, or hypothesizing after the results are known. In other words, viewing the results from a given set of analyses and then tailoring hypotheses to match the results obtained
  • Eliminate p-hacking, or conducting many statistical tests in many different ways in order to obtain a p value that is lower than the traditionally accepted cutoff of .05 and then presenting this result (or results) as the only test that was conducted to test the hypothesis
  • Eliminate outcome switching, or stating that a given result was the primary test planned all along when in fact it was not
  • Enhance the severity of a statistical test of a concrete hypothesis by following the pre-specified data analytic plan
  • Enhance the reproducibility of the study methods and procedures, as well as the data analyses. Each type of reproducibility can assist with (a) the evaluation of study claims, as well as (b) re-running the study to determine the replicability of the findings in another sample of participants
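
To make the p-hacking concern in the list above a bit more concrete, here is a small sketch of how the chance of at least one "significant" result grows with the number of tests run when there is no true effect. It rests on simplifying assumptions that are mine, not part of the definitions above: the tests are treated as independent, and the null hypothesis is true for every test (so each p value is uniformly distributed between 0 and 1). Real p-hacking usually involves correlated tests on the same data, so the exact numbers will differ. The function names are hypothetical.

```python
import random


def chance_of_any_false_positive(alpha: float, n_tests: int) -> float:
    """Probability that at least one of n independent null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_any_false_positive(alpha: float, n_tests: int,
                                n_studies: int = 20000, seed: int = 1) -> float:
    """Estimate the same probability by simulation.

    Each simulated study draws n_tests p values from a uniform(0, 1)
    distribution (the null case) and "reports" only the smallest one.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        smallest_p = min(rng.random() for _ in range(n_tests))
        if smallest_p < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 20):
        exact = chance_of_any_false_positive(0.05, k)
        simulated = simulate_any_false_positive(0.05, k)
        print(f"{k:>2} tests: exact = {exact:.3f}, simulated = {simulated:.3f}")
```

Under these assumptions, a single test has a 5% chance of a false positive, while picking the best of 20 tests pushes that chance to about 64%. A preregistered analysis plan makes it visible which test was the planned one.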

Any preregistration, even one that is poorly constructed, helps achieve the primary goal of being open and transparent regarding one’s intentions and proposed actions. Some preregistrations may assist with achieving the secondary goals, whereas some may not. But being open and transparent with your research intentions and proposed actions in a preregistration achieves the primary goal of preregistration. So do it.

References

Campbell, L., Loving, T.J., & LeBel, E.P. (2014). Enhancing transparency of the research process to increase accuracy of findings: A guide for relationship researchers. Personal Relationships, 21, 531-545. DOI: 10.1111/pere.12053

LeBel, E.P., Campbell, L., & Loving, T.J. (2017). Benefits of open and high-powered research outweigh costs. Journal of Personality and Social Psychology, 113, 230-243. DOI: 10.1037/pspi0000049

Moshontz, H., Campbell, L., … Chartier, C.R. (2018). The psychological science accelerator: Advancing psychology through a distributed collaborative network. Advances in Methods and Practices in Psychological Science, 1, 501-515. DOI: 10.1177/2515245918797607

Nosek, B.A., Beck, E., Campbell, L., Flake, J.K., Hardwicke, T.E., Mellor, D.T., van’t Veer, A.E., & Vazire, S. (2019). Preregistration is hard, and worthwhile. Trends in Cognitive Sciences. DOI: 10.1016/j.tics.2019.07.009