The Need for Some Agreement Before Debating Proposals to Address the “Replicability Crisis”

“For a debate to proceed, both teams need a clear understanding of what the motion means. This requires the motion to be ‘defined’ so that everyone (audience and adjudicators included) knows what is being debated. Problems arise if the two teams present different understandings of the meaning of the motion. This can result in a ‘definition debate’, where the focus of the debate becomes the meaning of the words in the motion, rather than the motion itself. Interaction and clash between the two teams concentrates on whose definition is correct, rather than the issues raised by the motion. Definition debates should be avoided wherever possible. They make a mockery of what debating seeks to achieve.” (Stockley, 2002)

One debate occurring across many scientific disciplines, including my own (social psychology), focuses on what should be done, if anything, to deal with the “replicability crisis” (i.e., the apparent inability of some study findings to be directly replicated within and across labs). As suggested by Stockley in the quote above, for a proper debate on the “replicability crisis” (i.e., the motion) to ensue, participants need to agree on some basic facts surrounding the issue being discussed. Only then can the strength of arguments for or against various propositions for resolving the issue be evaluated.

In the current “replicability crisis”, what are some facts we can all agree upon when debating resolutions? Based on my own reading over the past few years, here are a few points that seem fairly straightforward for our field to agree on, each of which has the potential to be problematic for the reliability of published research findings:

1) In a series of simulations, Colquhoun (2014) demonstrated that “…if you use p = .05 as a criterion for claiming that you have discovered an effect you will make a fool of yourself at least 30% of the time.” (p. 11). Stated differently, of the statistical analyses that yield statistically significant findings (p ≤ .05), roughly 30% or more can be expected to be false positives, that is, cases where no effect truly exists; this false discovery rate is much higher than the 5% figure the conventional Type I error rate might lead one to expect. I direct readers to Colquhoun’s paper, published in an open access journal (citation information below), to verify these claims. Ioannidis (2005) made similar arguments.

2) The overwhelming majority of published research papers report statistically significant results: well over 90% of all presented findings are statistically significant, whereas very few papers report non-significant findings (Fanelli, 2010; Sterling, 1959; Sterling, Rosenbaum, & Weinkam, 1995).

3) Considering the two previous points together, it is undeniable that a non-trivial proportion of published research findings are false positives. And given that very few non-significant findings are published at all, the rate of false negatives (i.e., claims of no effect when an effect truly exists) in the published literature is, by comparison, very low.

The first point is based on large-scale simulations using the types of statistical tests our field typically employs, and the second point is based on observations of actual published research in our journals. The third point is the logical conclusion that follows from pairing the first two. There are other important issues related to the “replicability crisis”, such as the use of questionable research practices to obtain p values at or below the accepted threshold of .05, but it is difficult to ascertain the prevalence of these practices, so I have not included them in this list.
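
To make the first point concrete, here is a minimal simulation sketch written in the spirit of Colquhoun’s demonstration rather than as a reproduction of his code; the assumed prior probability that a tested effect is real (10%), the per-group sample size, and the effect size are choices of mine that together give roughly 80% power.

    # A rough, illustrative simulation of the false discovery rate (not
    # Colquhoun's code): many two-group experiments, most testing true nulls,
    # each analyzed with an independent-samples t-test at alpha = .05.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    n_tests = 10_000     # number of independent two-group experiments
    prior_real = 0.10    # assumed proportion of tested effects that are real
    n_per_group = 16     # per-group sample size
    effect_size = 1.0    # Cohen's d for real effects (gives roughly 80% power)
    alpha = 0.05

    effect_is_real = rng.random(n_tests) < prior_real
    significant = np.zeros(n_tests, dtype=bool)

    for i in range(n_tests):
        d = effect_size if effect_is_real[i] else 0.0
        group_a = rng.normal(0.0, 1.0, n_per_group)
        group_b = rng.normal(d, 1.0, n_per_group)
        significant[i] = stats.ttest_ind(group_a, group_b).pvalue <= alpha

    false_pos = np.sum(significant & ~effect_is_real)
    true_pos = np.sum(significant & effect_is_real)
    print(f"False discovery rate: {false_pos / (false_pos + true_pos):.1%}")

    # Expected value under these assumptions:
    # alpha*(1 - prior) / (alpha*(1 - prior) + power*prior)
    #   = .05*.90 / (.05*.90 + .80*.10), or roughly 36%

Under these assumptions, the fraction of “significant” results arising from true nulls lands in the mid-30% range, consistent with the “at least 30%” figure quoted above; different but still plausible assumptions shift the number, which is why the first point is stated as an approximation.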

If we agree that a non-trivial proportion of currently published research findings are false positives for the reasons discussed above, we can then debate the merits of different proposals to address this issue. But if there is disagreement among participants in this debate about the proportion of published research findings that are false positives, then it becomes very difficult to evaluate the strength of the different arguments put forward, because those arguments are not addressing the issue at hand but rather the definition of the motion. And if we simply keep debating the definition of the motion (i.e., the prevalence of false positives in the literature), then it is difficult to envision any proposal for addressing this issue receiving a critical mass of support within the field. It may also be the case that, as a field, we would be making “…a mockery of what debating seeks to achieve”.

References

Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1, 140216. http://dx.doi.org/10.1098/rsos.140216

Fanelli, D. (2010). “Positive” results increase down the hierarchy of the sciences. PLoS ONE, 5(4), e10068. doi:10.1371/journal.pone.0010068.

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, e124.

Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance—Or vice versa. Journal of the American Statistical Association, 54, 30–34.

Sterling, T. D., Rosenbaum, W. L., & Weinkam, J. J. (1995). Publication decisions revisited: The effect of the outcome of statistical tests on the decision to publish and vice versa. The American Statistician, 49, 108–112.

Stockley, A. (2002). Defining motions & constructing cases: Guidelines for competitors and adjudicators. Retrieved from http://www.schoolsdebate.com/docs/definitions.asp on November 26, 2014.

The Current Status of Pre-registering Study Details in Social Psychology

The past few years have witnessed much debate regarding research practices that can potentially undermine the accuracy of reported research findings (e.g., p-hacking, lack of direct replication, low statistical power; Ioannidis, Munafo, Fusar-Poli, Nosek, & David, 2014; O’Boyle, Banks & Gonzalez-Mule, 2014; Simmons, Nelson, & Simonsohn, 2011), and some leading journals that publish research in the field of social psychology have made editorial changes to address these issues (e.g., Eich, 2014; Funder et al., 2014; Journal of Experimental Social Psychology, 2014). Pre-registration of study hypotheses and methods has been suggested as one way to enhance the accuracy of reported research findings by making the research process more transparent (e.g., Campbell, Loving, & LeBel, 2014; Chambers, 2014; De Groot, 1956/2014; Krumholz & Peterson, 2014; Miguel et al., 2014; The PLOS Medicine Editors, 2014). Many journals now have a registered reports section, where editors and reviewers focus on the strength of pre-registered methods and data-analytic plans for testing proposed hypotheses and accept articles for publication in advance of data collection (e.g., Perspectives on Psychological Science). A new journal, Comprehensive Results in Social Psychology (CRSP), supported by the European Association of Social Psychology as well as the Society of Australasian Social Psychologists, is the first social psychology journal to publish only pre-registered papers. Are researchers in the field of social psychology, however, presently following these suggestions by adopting the practice of pre-registering details of their studies?

There are different ways to answer this question, and the approach I adopted here was to cross-reference the current membership of the Society of Experimental Social Psychology (SESP; accessed October 1, 2014) with all current users of the Open Science Framework (OSF; accessed October 1-2, 2014). The membership of SESP was selected to represent the field of social psychology for the following reasons: (1) membership is open to any researcher regardless of disciplinary affiliation, (2) individuals are only eligible to be considered for membership after holding a PhD for five years and following an evaluation by a committee of the degree to which their publication record advances the field of social psychology, and (3) there are presently over 1000 members at institutions all over the world. Members of SESP therefore represent a cross-section of recognized social psychological researchers. I selected the user list of the OSF because, since its launch in 2011, the OSF has positioned itself as the most recognized third-party website for posting study details in the social sciences.

To conduct the cross-referencing, I first recorded all of the names listed in the membership directory of SESP (http://sesp.org/memlist.htm) in a spreadsheet. I then typed each name into the search window of the OSF website (https://osf.io) to identify current user status. If an individual was listed as a user, I navigated to his/her user page to determine (a) the number of projects the user had currently posted to the OSF website, and (b) how many of these projects were currently public (i.e., fully accessible by any visitor to the site). User status, number of projects, and number of public projects were entered into the spreadsheet. It is important to note that posted projects refer to studies already conducted or currently being conducted, given that project details remain on the site over time.
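
For readers who want to see the bookkeeping spelled out, here is a minimal sketch of that cross-referencing step; the file names and column labels are hypothetical placeholders, and the OSF lookup results are assumed to have been recorded by hand as described above.

    # Combine the SESP member list with manually recorded OSF lookup results
    # into a single table, one row per member (file names are hypothetical).
    import csv

    # sesp_members.csv: a single "name" column copied from http://sesp.org/memlist.htm
    with open("sesp_members.csv", newline="", encoding="utf-8") as f:
        members = [row["name"].strip() for row in csv.DictReader(f)]

    # osf_lookup.csv: "name", "n_projects", "n_public", recorded by hand for
    # each member found through the search window at https://osf.io
    with open("osf_lookup.csv", newline="", encoding="utf-8") as f:
        osf = {row["name"].strip(): row for row in csv.DictReader(f)}

    with open("sesp_osf_crossref.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "osf_user", "n_projects", "n_public"])
        for name in members:
            hit = osf.get(name)
            if hit is None:
                writer.writerow([name, 0, 0, 0])  # no OSF account found
            else:
                writer.writerow([name, 1, hit["n_projects"], hit["n_public"]])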

Descriptive analyses revealed that of the 1002 current members of SESP, 98 (or 9.8%) had created accounts on the OSF website. Among these account holders, the two highest frequencies were for posting zero projects (i.e., having an account only; 26.5%) and for posting one project (35.7%); the frequencies for posting more than one project then decreased very rapidly. Overall, 44% of posted projects were public, meaning that the details of a slight majority of projects (56%) were not publicly available. This is perhaps understandable given that researchers may prefer to wait to share pre-registered study details until a manuscript containing data from a given study has been accepted for publication.
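
Continuing the hypothetical sketch above, the descriptive summaries reported in this paragraph amount to a few tallies over that cross-referenced table.

    # Descriptive summaries of the cross-referenced table built in the earlier
    # sketch (file name and column labels remain hypothetical placeholders).
    import csv
    from collections import Counter

    with open("sesp_osf_crossref.csv", newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    users = [r for r in rows if r["osf_user"] == "1"]
    print(f"OSF accounts: {len(users)} of {len(rows)} members "
          f"({len(users) / len(rows):.1%})")

    # Distribution of posted projects among account holders
    project_counts = Counter(int(r["n_projects"]) for r in users)
    for k in sorted(project_counts):
        print(f"{k} project(s): {project_counts[k] / len(users):.1%} of users")

    # Share of all posted projects that are publicly accessible
    total_projects = sum(int(r["n_projects"]) for r in users)
    total_public = sum(int(r["n_public"]) for r in users)
    print(f"Public projects: {total_public / total_projects:.1%} of posted projects")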

On the one hand, it is a positive development that in a relatively short period of time (i.e., since 2011) close to 10% of researchers identified by their peers as making significant contributions to the study of social psychology (i.e., members of SESP) have created a user account on the OSF, the most prominent online site devoted to increasing transparency in the research process. On the other hand, over 90% of SESP members are currently not users of the OSF, and the individuals who are users have posted very few projects. It is very likely that the low number of posted projects does not reflect the actual number of research projects (active or completed) conducted in the labs of those SESP members who have posted projects on the OSF. It can therefore be concluded that pre-registering study details on the OSF is currently a very uncommon practice in the field of social psychology, at least within the membership of SESP. There presently exists a gap, therefore, between the suggestion to pre-register study details to enhance the transparency of the research process and the adoption of this practice among active researchers in social psychology.

This practice is likely to become more common going forward, but one potential explanation for the currently low rate of pre-registering study details is concern that the process is cumbersome, that not all study hypotheses are established at the time of data collection, and that other researchers may “scoop” posted hypotheses and methods (see Campbell et al., 2014). To the extent that these concerns pose real risks to researchers adopting pre-registration, the act of pre-registration itself could be argued to hurt the advancement of ideas in our field. This argument is largely philosophical at this time, given that there is simply not enough empirical evidence upon which to evaluate the possibility.

 

References

Campbell, L., Loving, T.J., & LeBel, E.P. (2014). Enhancing transparency of the research process to increase accuracy of findings: A guide for relationship researchers. Personal Relationships. doi:10.1111/pere.12053

Chambers, C. (2014). Psychology’s ‘registration revolution’. The Guardian. Retrieved from http://www.theguardian.com/science/head-quarters/2014/may/20/psychology-registration-revolution.

De Groot, A. D. (1956/2014). The meaning of “significance” for different types of research. Translated and annotated by Eric-Jan Wagenmakers, Denny Borsboom, Josine Verhagen, Rogier Kievit, Marjan Bakker, Angelique Cramer, Dora Matzke, Don Mellenbergh, and Han L. J. van der Maas. Acta Psychologica, 148, 188-194.

Eich, E. (2014). Business not as usual. Psychological Science, 25, 3-6.

Funder, D.C., Levine, J.M., Mackie, D.M., Morf, C.C., Vazire, S., & West, S.G. (2014). Improving the dependability of research in personality and social psychology: Recommendations for research and educational practice. Personality and Social Psychology Review, 18, 3-12.

Ioannidis, J.P., Munafo, M.R., Fusar-Poli, P., Nosek, B.A., & David, S.P. (2014). Publication and other reporting biases in cognitive sciences: Detection, prevalence, and prevention. Trends in Cognitive Sciences, 18, 235-241.

Journal of Experimental Social Psychology (2014). JESP editorial guidelines. Retrieved from http://www.journals.elsevier.com/journal-of-experimental-social-psychology/news/jesp-editorial-guidelines/.

Krumholz, H.M., & Peterson, E.D. (2014). Open access to clinical trials data. The Journal of the American Medical Association, 312, 1002-1003.

Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K.M., Gerber, A., … Van der Laan, M. (2014). Promoting transparency in social science research. Science, 343(6166), 30-31.

O’Boyle, Jr., E.H., Banks, G.C., & Gonzalez-Mule, E. (2014). The Chrysalis effect: How ugly initial results metamorphosize into beautiful articles. Journal of Management. doi:10.1177/0149206314527133

Simmons, J.P., Nelson, L.D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22, 1359-1366.

The PLOS Medicine Editors (2014). Observational studies: Getting clear about transparency. PLoS Medicine, 11(8), e1001711. doi:10.1371/journal.pmed.1001711.