Week 2: Why Should Science be Open and Reproducible?

In one word: evaluation.

The motto of the Royal Society is Nullius in verba, or “Take nobody’s word for it”. When evaluating scientific claims, this means focusing on the appropriateness of the research process (e.g., study design, analytic design, interpretation of outcomes, and so on) rather than accepting the claims based on the authority of the individual(s) proposing them. To evaluate the reported outcomes of a study, it is therefore critical to have access to details of the research process that produced those outcomes.

For many years now, these details have typically been shared in the print pages of academic journals. Academic journals typically have page budgets, or a contractually agreed-upon number of print pages that can appear in a one-year period. Longer individual papers therefore translate into fewer papers published in a given journal each year, meaning that Editors have often asked authors to shorten their papers during the review process. Given the complexity of many experimental designs, and the volume of information collected in many large-scale observational studies, it is not feasible to include all methodological details or results within each paper. Authors have typically provided as much detail about how the study was conducted as they felt necessary to interpret the results presented, with the promise that omitted details were available upon request.

But are these details really available upon request? There are many reasons to believe that at least some details would not be. For example, individuals may change careers and no longer have access to these materials; over time we all die and are not able to provide these details; issues with storage (of both physical and digital copies) can arise that render the details no longer accessible; people are busy and may not have the time to search for these details when asked. And the list goes on. As suggested by the readings for this week, it turns out that the majority of requested research details are not in fact available upon request! This includes access to data and analytic code, which are important to have available given the non-trivial number of statistical reporting errors in the literature. Nullius in verba implies that we should not take the word of researchers who say “available upon request”.

The paper by Simmons, Nelson, and Simonsohn (2011) in Psychological Science, as well as the interesting information presented on www.flexiblemeasures.com, also suggest that evaluation of scientific claims requires knowing which research decisions (e.g., hypotheses, use of measures, data analytic plans) were made before data collection and/or data analysis, and which were made after. For example, regarding the use of the Competitive Reaction Time Task (CRTT) in research assessing aggressive behavior, it has been shown that responses to this task have been quantified for data analysis in 156 different ways within 130 academic publications (http://www.flexiblemeasures.com/crtt/)! It seems likely that some of the decisions regarding how to quantify the CRTT within these publications were made during the process of data analysis (i.e., using a data-driven approach to determine a quantification strategy that yielded “significant” results), but it is impossible to determine which ones in the absence of pre-registration of data analytic strategies. Simmons et al. (2011) also demonstrate how the application of different Questionable Research Practices (QRPs) (e.g., checking for significance, collecting more data, and checking again; trying out a few different DVs; adding covariates and/or moderators) can inflate the Type I error rate well beyond the 5% level typically adopted by researchers using a frequentist data analytic approach. We need to know when decisions were made during the research process, and why, in order to properly evaluate the reported results of the research.
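To make the "check, collect more data, check again" problem concrete, here is a minimal simulation sketch in Python. It is my own illustration and not the procedure used by Simmons et al. (2011): it assumes two groups drawn from the same normal distribution (so every "significant" result is a false positive), uses a simple z-test with a known standard deviation of 1 rather than a t-test, and the sample sizes, seed, and function names are arbitrary choices.

```python
import random
from statistics import NormalDist, mean


def p_value(group_a, group_b):
    """Two-sided z-test p-value for a difference in means (SD assumed known and equal to 1)."""
    n_a, n_b = len(group_a), len(group_b)
    z = (mean(group_a) - mean(group_b)) / ((1 / n_a + 1 / n_b) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(n_sims=5000, n_start=20, n_extra=10, alpha=0.05, seed=1):
    """Compare a fixed-sample test with 'test, add participants, test again'.

    The null hypothesis is true in every simulated study, so each
    significant result counts as a Type I error.
    """
    rng = random.Random(seed)
    fixed_hits = 0      # significant at the planned sample size only
    flexible_hits = 0   # significant at the first OR the second look
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(n_start)]
        b = [rng.gauss(0, 1) for _ in range(n_start)]
        first_look = p_value(a, b) < alpha
        fixed_hits += first_look
        if first_look:
            flexible_hits += 1
            continue
        # Not significant yet: collect more data and test again.
        a += [rng.gauss(0, 1) for _ in range(n_extra)]
        b += [rng.gauss(0, 1) for _ in range(n_extra)]
        flexible_hits += p_value(a, b) < alpha
    return fixed_hits / n_sims, flexible_hits / n_sims


fixed_rate, flexible_rate = simulate()
print(f"False positive rate, fixed n:         {fixed_rate:.3f}")
print(f"False positive rate, one extra look:  {flexible_rate:.3f}")
```

Because the flexible strategy counts every study that was significant at the first look plus any that became significant after adding participants, its false positive rate can only match or exceed the fixed-sample rate. Without a record of which strategy was planned in advance, a reader cannot tell the two apart from the published result alone.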

Current technology allows for sharing important details of the research process as the research is being considered, conducted, analyzed, and written up into a manuscript. We are no longer limited by how many print pages a traditional journal is budgeted to print each year and then mail to subscribers. We are also not limited with respect to when we share details of the research process—we no longer have to wait to publish everything we can squeeze into a manuscript.

This is an exciting time to be a scientist, but of course it is scary for some scientists to consider all of these changes occurring in a fairly short period of time. In my opinion, greater openness and transparency of the research process is the future of science. There is so much that is being done, and that can be done, to ensure our research is open and reproducible. Next week we start at the beginning of the research process.

Open and Reproducible Science—Introduction to my new Graduate Course

In 2014 my lab began the transition to using open and reproducible research practices (I wrote a blog post about it). Almost a year later, and after a steep learning curve, I realized I needed to organize my open science. There was a lot of discussion within the field of psychology at that time on the idea of open science, but few suggestions on how to actually implement different open science practices throughout the research workflow for different types of research projects. Together with Etienne LeBel and Tim Loving, I thought it through, we published a paper in December 2014 with some specific recommendations (or so they seemed at the time), and then our lab made it up as we went along. To my pleasant surprise I was asked to give a workshop on “doing open science” in November 2015 at the University of Toronto, Mississauga. I really enjoyed talking to faculty and students about this topic, and I was honored to be asked to give similar workshops/talks at many different places during the next two years. Overall, I have now given 14 presentations on open and reproducible science in Canada, the USA, New Zealand, and Turkey (at the bottom of this post there is a list of these talks, including links to slides and recordings where available). I am also very happy to see that in the 3 or so years since publishing our open science recommendations, many journals in the field of psychology are changing their editorial policies to align with open science practices.

As I developed and tweaked my slides over this two-year period, I learned a lot more about (a) what is being done in different fields to enhance research transparency and reproducibility, and (b) what can be done with existing technology. With all of this information, I decided to create a new graduate course called “Open and Reproducible Science” so I could share with trainees how they can begin their research career in a way that makes their future publications more open to evaluation and more likely to be reproducible (in many ways) by others (something I suggested was lacking in our graduate training programs here). I put together a syllabus and solicited feedback via Twitter. I received many helpful suggestions, as well as two offers for guest lectures: one by Seth Green from CodeOcean.com, and one from Joanne Paterson, a librarian at Western University. Click here to see what I ended up putting together for this inaugural course. I am excited that 16 grad students from different areas of my psychology department enrolled for this course, beginning January 11, 2018.

My goal is to write expanded lecture notes in a blog post for each week of the class. In these posts I will discuss my planned talking points for each class, as well as flesh out specific examples of how one might use different open science practices throughout the research workflow. Ok, now I need to go re-read the assigned article for the first class: Munafò et al.’s (2017) “A manifesto for reproducible science”.

Invited Talks on Open Science and Replication

2015

  • November 3, Workshop presented at the University of Toronto, Mississauga (Psychology), Canada

2016

  • January 28, Pre-Conference of the Society of Personality and Social Psychology (SPSP), San Diego, USA
  • June 10, Conference of the Canadian Psychological Association, Victoria, Canada
  • October 3, York University (Psychology), Canada (audio recording)
  • October 11, University of Toronto (Psychology), Canada
  • October 19, University of Guelph (Family Relations and Applied Nutrition), Canada
  • October 21, Illinois State University (Psychology), USA
  • November 11, Victoria University Wellington (Psychology), New Zealand
  • November 24, University of Western Ontario (Clinical Area), Canada
  • December 2, University of Western Ontario (Developmental Area), Canada

2017

  • January 19, Workshop presented at Sabanci University, Istanbul, Turkey (with thanks to a Travel Grant awarded to Asuman Buyukcan-Tetik and me from the European Association of Social Psychology)
  • March 10, Western Research Forum Panel Discussion on Open Access: “What’s in it for me?”, London, Canada
  • May 25, Workshop presented at the conference of the Association for Psychological Science (APS), Boston, USA
  • November 10, Plenary address, conference of the Society for the Scientific Study of Sexuality (SSSS), Atlanta, USA

Pre-Registered Publications From Our Lab

Updated

Below is a list of now-published studies (as of October 20, 2017) that had pre-registered (a) hypotheses, (b) procedure and materials, and (in most cases) (c) a data analytic plan. We have more original empirical studies under review and in preparation. When I compiled this list I found it interesting that of the six original empirical pre-registered publications, three are in open access journals. We are also currently collecting data for a registered report involving videotaping lab-based interactions between romantically involved partners. Our lab has also been active in conducting, and publishing, replication studies. We have published seven replication studies to date, including one Registered Replication Report with data from 16 independent labs. Four of these publications resulted from the group project in my graduate research methods course. Another publication (“Self-esteem, relationship threat…”) was conducted by all members of the lab and involved running over 200 romantically involved couples through a lab-based manipulation, one couple at a time; it took one year to collect the data. Upon publication of this paper, however, we did receive our $1000 pre-registration challenge prize money and had a wonderful “lab night out”.

In my view the practice of pre-registration has been helpful in many ways, such as (a) helping clarify what we truly expect to emerge and what we simply think might happen, (b) making us ask ourselves why we are including each measure (why is it relevant/important?), (c) allowing us to develop our data analytic code while data is being collected because we already thought out our data analytic plan, and (d) making it easier to write the manuscript when we are finished given that we have already largely written the methods section, as well as written the rationale for our hypotheses and data analytic plan. I will let others judge if this practice has stifled our creativity (but look at this study, not yet published, before making your final judgement: https://osf.io/yksxt/).

Note: links to the OSF project pages are located on the journal name.

Original Empirical Studies

Dobson, K., Campbell, L., & Stanton, S.C.E. (in press). Are you coming on to me? Bias and accuracy in couples’ perceptions of sexual advances. Journal of Social and Personal Relationships.

Kohut, T., Balzarini, R.N., Fisher, W.A., & Campbell, L. (in press). Pornography’s associations with open sexual communication and relationship closeness vary as a function of dyadic patterns of pornography use within heterosexual relationships. Journal of Social and Personal Relationships.

Balzarini, R.N., Campbell, L., Kohut, T., Holmes, B.M., Lemiller, J.J., Harman, J.J., & Atkins, N. (2017). Perceptions of primary and secondary relationships in polyamory. PLoS ONE 12(5): e0177841. https://doi.org/10.1371/journal.pone.0177841.

Buyukcan-Tetik, A., Campbell, L., Finkenauer, C., Karremans, J.C., & Kappen, G. (2017). Ideal standards, acceptance, and relationship satisfaction: Latitudes of differential effects. Frontiers in Psychology, doi: 10.3389/fpsyg.2017.01691.

Campbell, L., Chin, K., & Stanton, S.C.E. (2016). Initial evidence that individuals form new relationships with partners that more closely match their ideal preferences. Collabra, 2(1), p.2. DOI: http://doi.org/10.1525/collabra.24

Stanton, S.C.E., & Campbell, L. (2016). Attachment avoidance and amends-making: A case advocating the need for attempting to replicate one’s own work. Journal of Experimental Social Psychology, 67, 43-49.

In Principle Agreement (Registered Report)

Hahn, C., Campbell, L., Pink, J.C., & Stanton, S.C.E. (In principle agreement). The role of adult attachment orientation in information-seeking strategies employed by romantic partners. Comprehensive Results in Social Psychology.

Replication Studies

Babcock, S., Li, Y., Sinclair, V., Thomson, C., & Campbell, L. (2017). Two replications of an investigation on empathy and utilitarian judgment across socioeconomic status. Scientific Data 4, Article number: 160129, doi: 10.1038/sdata.2016.129

Balakrishnan, A., Palma, P.A., Patenaude, J., & Campbell, L. (2017). A 4-study replication of the moderating effects of greed on socioeconomic status and unethical behaviour. Scientific Data 4, Article number: 160120, doi: 10.1038/sdata.2016.120

Balzarini, R.N., Dobson, K., Chin, K., & Campbell, L. (2017). Does exposure to erotica reduce attraction and love for romantic partners in men? Independent replications of Kenrick, Gutierres, and Goldberg (1989) study 2. Journal of Experimental Social Psychology, 70, 191-197.

Campbell, L., Balzarini, R.N., Kohut, T., Dobson, K., Hahn, C.M., Moroz, S.E., & Stanton, S.C.E. (2017). Self-esteem, relationship threat, and dependency regulation: Independent replication of Murray, Rose, Bellavia, Holmes, and Kusche (2002) Study 3. Journal of Research in Personality. https://doi.org/10.1016/j.jrp.2017.04.001.

Cheung, I., Campbell, L., LeBel, E.P., … Yong, J.C. (2016). Registered replication report: Study 1 from Finkel, Rusbult, Kumashiro, & Hannon (2002). Perspectives on Psychological Science, 11, 750-764.

Connors, S., Khamitov, M., Moroz, S., Campbell, L., & Henderson, C. (2016). Time, money, and happiness: Does putting a price on time affect our ability to smell the roses? Journal of Experimental Social Psychology, 67, 60-64.

LeBel, E.P., & Campbell, L. (2013). Heightened sensitivity to temperature cues in highly anxious individuals: Real or elusive phenomenon? Psychological Science, 24, 2128-2130.

A Commitment to Better Research Practices (BRPs) in Psychological Science

Scientific research is an attempt to identify a working truth about the world that is as independent of ideology as possible. As we appear to be entering a time of heightened skepticism about the value of scientific information, we feel it is important to emphasize and foster research practices that enhance the integrity of scientific data and thus scientific information. We have therefore created a list of better research practices that we believe, if followed, would enhance the reproducibility and reliability of psychological science. The proposed methodological practices are applicable for exploratory or confirmatory research, and for observational or experimental methods.

  1. If testing a specific hypothesis, pre-register your research[1], so others can know that the forthcoming tests are informative. Report the planned analyses as confirmatory, and report any other analyses or any deviations from the planned analyses as exploratory.
  2. If conducting exploratory research, present it as exploratory. Then, document the research by posting materials, such as measures, procedures, and analytical code so future researchers can benefit from them. Also, make research expectations and plans in advance of analyses—little, if any, research is truly exploratory. State the goals and parameters of your study as clearly as possible before beginning data analysis.
  3. Consider data sharing options prior to data collection (e.g., complete a data management plan; include necessary language in the consent form), and make data and associated meta-data needed to reproduce results available to others, preferably in a trusted and stable repository. Note that this does not imply full public disclosure of all data. If there are reasons why data can’t be made available (e.g., containing clinically sensitive information), clarify that up-front and delineate the path available for others to acquire your data in order to reproduce your analyses.
  4. If some form of hypothesis testing is being used or an attempt is being made to accurately estimate an effect size, use power analysis to plan research before conducting it so that it is maximally informative.
  5. To the best of your ability maximize the power of your research to reach the power necessary to test the smallest effect size you are interested in testing (e.g., increase sample size, use within-subjects designs, use better, more precise measures, use stronger manipulations, etc.). Also, in order to increase the power of your research, consider collaborating with other labs, for example via StudySwap (https://osf.io/view/studyswap/). Be open to sharing existing data with other labs in order to pool data for a more robust study.
  6. If you find a result that you believe to be informative, make sure the result is robust. For smaller lab studies this means directly replicating your own work or, even better, having another lab replicate your finding, again via something like StudySwap.  For larger studies, this may mean finding highly similar data, archival or otherwise, to replicate results. When other large studies are known in advance, seek to pool data before analysis. If the samples are large enough, consider employing cross-validation techniques, such as splitting samples into random halves, to confirm results. For unique studies, checking robustness may mean testing multiple alternative models and/or statistical controls to see if the effect is robust to multiple alternative hypotheses, confounds, and analytical approaches.
  7. Avoid performing conceptual replications of your own research in the absence of evidence that the original result is robust and/or without pre-registering the study. A pre-registered direct replication is the best evidence that an original result is robust.
  8. Once some level of evidence has been achieved that the effect is robust (e.g., a successful direct replication), by all means do conceptual replications, as conceptual replications can provide important evidence for the generalizability of a finding and the robustness of a theory.
  9. To the extent possible, report null findings. In science, null news from reasonably powered studies is informative news.
  10. To the extent possible, report small effects. Given the uncertainty about the robustness of results across psychological science, we do not have a clear understanding of when effect sizes are “too small” to matter. As many effects previously thought to be large are small, be open to finding evidence of effects of many sizes, particularly under conditions of large N and sound measurement.
  11. When others are interested in replicating your work be cooperative if they ask for input. Of course, one of the benefits of pre-registration is that there may be less of a need to interact with those interested in replicating your work.
  12. If researchers fail to replicate your work continue to be cooperative. Even in an ideal world where all studies are appropriately powered, there will still be failures to replicate because of sampling variance alone. If the failed replication was done well and had high power to detect the effect, at least consider the possibility that your original result could be a false positive. Given this inevitability, and the possibility of true moderators of an effect, aspire to work with researchers who fail to find your effect so as to provide more data and information to the larger scientific community that is heavily invested in knowing what is true or not about your findings.
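As an illustration of the power-analysis recommendations in items 4 and 5, here is a minimal sketch in Python of an a priori sample size calculation for the simplest case: comparing two independent group means with a two-sided test. It uses the standard normal approximation, n per group = 2 × ((z for 1 − α/2 + z for power) / d)², where d is the smallest standardized effect size of interest. The function name and default values are illustrative assumptions, and dedicated power software based on the t distribution will typically return slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-sided,
    two-group comparison of means (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Required n per group for a range of smallest effect sizes of interest.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The required sample size grows with the inverse square of the effect size, which is why halving the smallest effect of interest roughly quadruples the number of participants needed, and why pooling data across labs can be the only practical route to an informative study.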

We should note that these proposed practices are complementary to other statements of commitment, such as the commitment to research transparency (http://www.researchtransparency.org/). We would also note that the proposed practices are aspirational. Ideally, our field will adopt many, if not all, of these practices. But we also understand that change is difficult and takes time. In the interim, it would be ideal to reward any movement toward better research practices.

Brent W. Roberts, Rolf A. Zwaan, Lorne Campbell

[1] van ’t Veer, A. E., & Giner-Sorolla, R. (2016). Pre-registration in social psychology—A discussion and suggested template. Journal of Experimental Social Psychology, 67, 2–12. doi:10.1016/j.jesp.2016.03.004

The Twelve Days of Open Science

On the first day of Open Science, research mavericks gave to me, a huge study called the RP:P.

On the second day of Open Science, research mavericks gave to me, the Center for Open Science, And a huge study called the RP:P.

On the third day of Open Science, research mavericks gave to me, a list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the fourth day of Open Science, research mavericks gave to me, The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the fifth day of Open Science, research mavericks gave to me, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the sixth day of Open Science, research mavericks gave to me, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the seventh day of Open Science, research mavericks gave to me, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the eighth day of Open Science, research mavericks gave to me, registered replication reports, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the ninth day of Open Science, research mavericks gave to me, open access journals, registered replication reports, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the tenth day of Open Science, research mavericks gave to me, pre-print servers, open access journals, registered replication reports, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the eleventh day of Open Science, research mavericks gave to me, pre-registration, pre-print servers, open access journals, registered replication reports, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

On the twelfth day of Open Science, research mavericks gave to me, Facebook discussion groups, pre-registration, pre-print servers, open access journals, registered replication reports, a team of second string researchers, replication studies, P-Hacking! The Open Science Framework, A list of QRPs, The Center for Open Science, And a huge study called the RP:P.

My 2016 Open Science Tour

I have been asked to discuss my views on open science and replication, particularly in my field of social psychology, nine times in 2016 (see my “Open Science Tour” dates below). During these talks, and in discussions that followed, people wanted to know what exactly is open science, and how might a researcher go about employing open science practices?

Overall, faculty and students asked me many similar questions, so I thought I would create a list of these frequently asked questions. I do not provide a summary of my responses to these questions, instead wanting readers to consider how they would respond. So, how would you answer these questions? (public google doc for posting answers)

  1. Given that many findings are not, and in many cases cannot be, predicted in advance, how can I pre-register my hypotheses?
  2. If my research is not confirmatory, do I need to use open science practices? Isn’t open science only “needed” when very clear hypotheses are being tested?
  3. How can I share data?
    • What data do I “need” to share? (All of it? Raw data? Aggregated data?)
    • What platforms are available for data sharing? (and what is the “best” one?)
    • What format/software should be used?
    • Is this really necessary?
    • How should I present this to my research ethics board?
  4. Can I publicly share materials that are copyrighted?
  5. What is a data analytic plan?
  6. Is it really important to share code/syntax from my analyses?
  7. Can’t researchers simply “game the system”? That is, conduct research first, then pre-register after results are known (PRARKing), and submit for publication?
  8. Can shared data, or even methods/procedures, be treated as unique “citable units”?
  9. If I pilot test a procedure in order to obtain the desired effects, should the “failed” pilot studies be reported?
    • If so won’t this bias the literature by diluting the evidence in favor of the desired/predicted effect obtained in later studies?
  10. How much importance should I place on statistical power?
    • Given that effect sizes are not necessarily knowable in advance, and straightforward procedures are not available for more complex designs, is it reasonable to expect a power analysis for every study/every analysis?
  11. If I use open science practices but others do not, can they benefit more in terms of publishing more papers because of fewer “restrictions” on them?
    • If yes, how is this fair?

Unique questions from students:

  1. Could adopting open science practices result in fewer publications?
  2. Might hiring committees be biased against applicants that are pro open science?
  3. If a student wants to engage in open science practices, but his/her advisor is against this, what should this student do?
  4. If a student wants to publish studies with null findings, but his/her advisor is against this, what should this student do?
  5. Will I “need” to start engaging in open science practices soon?
  6. Will it look good, or bad, to have a replication study (studies) on my CV?
  7. What is the web address for the open science framework? How do I get started?

My Open Science tour dates in 2016 (links to slides provided):

  • January 28, Pre-Conference of the Society of Personality and Social Psychology (SPSP), San Diego, USA
  • June 10, Conference of the Canadian Psychological Association, Victoria, Canada
  • October 3, York University (Psychology), Canada (audio recording)
  • October 11, University of Toronto (Psychology), Canada
  • October 19, University of Guelph (Family Relations and Applied Nutrition), Canada
  • October 21, Illinois State University (Psychology), USA
  • November 11, Victoria University Wellington (Psychology), New Zealand
  • November 24, University of Western Ontario (Clinical Area), Canada
  • December 2, University of Western Ontario (Developmental Area), Canada

Tone Deaf

In my field of psychological science there have been many discussions over the past few years on the way an argument is expressed, its tone. A common theme is the general desire for academic discussions to be positive and respectful, and not mean and antagonistic. With the release of Susan Fiske’s commentary on the state of scientific communication (see a detailed discussion of the commentary in the context of other developments in the field over the past decade here), the discussion of “tone” has heated up again. This is particularly true for the Facebook discussion group “PsychMap” where the tone of communication is closely monitored.

The following of course is simply my own opinion, and I respect that others disagree with this opinion, but I do not really care that much about the tone of an argument. A person can offer up a positive, or neutral, argument and be full of shit, or not. A person can offer up a negative, sarcastic, even rude argument and be on the mark, or not. If you have sat through a few faculty meetings you will know exactly what I mean. Personally, I do my best (and sometimes my best is not good enough, to be honest) to focus on the argument being presented and not on how the argument is presented. I can only control (a) how I decide to put forward my own arguments (asshole or angel, or somewhere in between), and (b) how I respond to others’ arguments. In my opinion the tone of argument reflects more on the person delivering the argument than on the target of the argument. I accept that if I choose to deliver my arguments in a manner most of my colleagues would perceive as obnoxious and combative, I may not be taken so seriously by these colleagues for very long. I personally therefore choose to be positive, or at least direct in a fairly neutral manner, with the majority of my arguments (hopefully as reflected in my blog posts over the past year and a half, and in my papers on meta-scientific issues). I therefore prefer discussions not to be officially moderated, and to let people own the words they choose to use to present their views. The field of academic psychology is literally a community of highly educated individuals who are smart enough to know the difference between shit and Shinola; we can figure out if an argument, however presented, has substance or not.

And for what it’s worth, it seems to me that the majority of discussions I am privy to in private and on social media are positive and constructive in tone. That is nice.

Organize your Data and Code for Sharing from the Start

On September 12, 2016, experimental psychologist Christopher Ferguson created a “go-fund-me” page to raise funds for access to an existing data set that was used to advance scientific arguments in a scientific publication (link here). In Ferguson’s own words: “So I spoke with the Flourishing Families project staff who manage the dataset from which the study was published and which was authored by one of their scholars.  They agreed to send the data file, but require I cover the expenses for the data file preparation ($300/hour, $450 in total; you can see the invoice here).” Ferguson’s request has generated a lot of discussion on social media (this link as well), with many individuals disappointed that data used to support ideas put forward in a scientific publication are only available after a big fee is paid. Others feel a fee is warranted given the amount of effort required to put together the data requested into one file, as well as instructions regarding how to use the data file. And in the words of one commenter, “But I also know people who work with giant longitudinal datasets, and preparing just the codebook for one of those, in a way that will make sense to people outside the research team, can take weeks.” (highlighting added by me).

As someone who has collected data over time from large numbers of romantically involved couples, I agree that it would take some time to prepare these data sets and codebooks for others to understand. But I think this is a shame really, and is a problem in need of a solution. If it takes me weeks to prepare documentation to explain my dataset organization to outsiders, I am guessing it would take the same amount of time to explain the same dataset organization to my future self (e.g., when running new analyses with an existing data set), or a new graduate student who wants to use the data to test new ideas, not to mention people outside of the lab. This seems highly inefficient for in-lab research activities, and represents the potential loss of valuable data to the field given that others may never have access to my data in the event that (a) I am too busy to spend weeks (or even hours for other data sets) putting everything together for others to make sense of my data, or (b) I die before I put these documents together (I am 43 with a love of red meat, so I could drop dead tomorrow. I think twice before buying green bananas).

So what is my proposed solution? Organize your data and code from the start with the assumption that you will need to share this information (see also “Why scientists must share their research code”). Create a data management plan at the beginning of all your research projects. Consider how the data will be organized, where it will be stored, and where the code for data cleaning/variable generation, analyses, and plots will be stored. Create meta-data (information about your dataset) along the way, updating as needed; consider where to store this meta-data from the beginning. If you follow these steps, your data, meta-data, and code can be available for sharing in a manner understandable to other competent researchers in a matter of minutes, not weeks. Even for complex data sets. Your future self will thank you. Your future graduate students will thank you. Your future colleagues will praise your foresight long after you are dead, as your [organized] data will live on.
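As a minimal sketch of what creating meta-data "along the way" could look like, the Python snippet below builds a simple codebook (one entry per variable) from a data file. Everything in it is hypothetical and for illustration only: the function name build_codebook, the example variables (couple_id, wave, satisfaction), and the choice of fields to record are assumptions, not a standard or anything prescribed by a particular repository.

```python
import csv
import io
import json


def _is_number(text):
    """Return True if the text can be parsed as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def build_codebook(csv_text, descriptions):
    """Return one metadata entry per column of a CSV dataset.

    csv_text: the dataset as CSV text with a header row.
    descriptions: dict mapping variable names to human-readable descriptions.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        present = [v for v in values if v not in ("", None)]
        codebook.append({
            "variable": name,
            # Flag undocumented variables so they are easy to spot later.
            "description": descriptions.get(name, "TODO: describe this variable"),
            "n_missing": len(values) - len(present),
            "numeric": bool(present) and all(_is_number(v) for v in present),
        })
    return codebook


# Hypothetical example data; a real project would read a file from disk.
EXAMPLE_CSV = "couple_id,wave,satisfaction\n1,1,5.5\n1,2,\n2,1,6.0\n"
EXAMPLE_DESCRIPTIONS = {
    "couple_id": "Anonymous identifier for each couple",
    "wave": "Data collection wave (1 = baseline)",
}

if __name__ == "__main__":
    codebook = build_codebook(EXAMPLE_CSV, EXAMPLE_DESCRIPTIONS)
    print(json.dumps(codebook, indent=2))
```

Re-running something like this whenever a variable is added keeps the codebook in step with the data, and the "TODO" entries show at a glance which variables still need a description. Storing the resulting file next to the data and the analysis code is one way to make all three shareable together.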

Update: see Candice Morey’s post on the same topic.

 

How to Publish an Open Access Edited Volume on the Open Science Framework (OSF)

Edited volumes are collections of chapters on a particular topic by various experts. In my own experience as a co-editor of three (3) edited volumes, the editors select the topic, select and invite the experts (or authors), and identify a publisher. Once secured, a publisher typically offers a cash advance to the editor(s) along with a small percentage of sales going forward in the form of royalties. The publisher may also provide reviewing services for the collection of chapters, and will advertise the edited volume when it is released. The two primary ways for consumers to access the chapters are to (a) purchase the book, or (b) obtain a copy of the book from a library.

With technological advances it is now possible to publish edited volumes without a professional publishing company. Why would someone choose not to use a publishing company? After all, they are literally publication experts. Perhaps the biggest reason is that the resulting volume will be open access, or available to anyone with a connection to the internet, free of charge. There are also some career advantages to sharing knowledge open access. Also, a publishing company is simply not needed for all publication projects.

There are very likely many different ways to publish an edited volume without using a professional publishing company. Below, I outline one possibility that involves using the Open Science Framework (OSF). Suggestions for improving these steps are welcome.

Steps to Using the OSF to publish an Open Access Edited Volume

  1. Identify a topic for the edited volume, and then identify a list of experts that you would like to invite to contribute chapters.
  2. If you do not have an OSF account, create one (it is free). Create a new project page for your edited volume, and give it the title of the proposed edited volume. Select one of the licensing options for your project to grant copyright permission for this work.
  3. Draft a proposal for your edited volume (e.g., the need for this particular collection of chapters, goals of the volume, target audience, and so on). Add this file to the project page.
  4. Send an email inviting potential authors, providing a link to your OSF project page so they can read your proposal.
    • You can make the project page public from the start and simply share the link, or,
    • You can keep the project page private during the development of the edited volume and “share” a read-only link to the project page with prospective authors only.
  5. Ask all authors who accepted the invitation to create an OSF account. Then create a component for each individual chapter; components are part of the parent project, but are treated as independent entities in the OSF. Use the proposed title for each chapter as the title of the component. Add the author(s) as administrators for the relevant component (e.g., A. Smith has agreed to author chapter #4; add A. Smith as an administrator of component #4).
  6. Ask authors to upload a copy of their first draft by the selected deadline. Provide feedback on every chapter.
    • One option is to download a copy of the chapter, make edits using the track changes option, and then upload a copy of the edited chapter using the same title as the original in order to take advantage of the “version control” function of the OSF (i.e., all versions of the chapter will be available on the project page in chronological order, with the most recent version at the top of the list).
  7. Ask authors to upload their revised chapter using the same title (again to take advantage of the “version control” function of the OSF).
  8. When the chapters are completed, “register” the project and all components. This will “freeze” all of the files, meaning changes can no longer be made. The registered components, or chapters, represent the final version of the edited volume. Then…
    • Make all of the components, as well as the main project registration, public;
    • Enable the “comments” option so that anyone can post comments within each component (e.g., to discuss the material presented in the chapter);
    • Click the link to obtain a Digital Object Identifier (DOI) for each component (i.e., chapter).
  9. Advertise the edited volume
    • Use social media, including Facebook discussion groups and Twitter (among others). Encourage readers to leave comments for each chapter on the OSF pages;
    • Ask your University to issue a press release;
    • Ask your librarian for tips on how to advertise your new Open Access edited volume (librarians are an excellent resource!!).

Prior to following these steps to create your own Open Access edited volume on the OSF (or by using a different approach), there are some pros and cons to consider:

Pros

  • You have created an edited volume that is completely Open Access
  • The volume cost no money to create, no money to advertise, and no money to purchase
  • Given that the chapters are available to a wider audience than a traditional edited volume released by a for profit publishing company, it is likely that they will actually reach a wider audience as well and have a greater scientific impact

Cons

  • You do not receive a cash advance or royalties
  • You do not receive any assistance from a publisher for reviewing or advertising
  • This approach is new compared to traditional publishing, and therefore you may be concerned that you will not receive proper credit from others (e.g., people evaluating your contributions to science when deciding to hand out grant funds, jobs, promotions, and so on)

Final Thoughts

There is usually more than one way to achieve the same aim. Professional publishing companies work with academics to create many edited volumes every year, but creating an edited volume does not inherently require the assistance of a professional publishing company. The purpose of this post was to present one alternative using the functionality of the Open Science Framework to publish an edited volume that is Open Access. I am sure there are even more ways to achieve this aim.

How much Research is Confirmatory Versus Exploratory?

The president of APS is nervous about pre-registration, or the idea of writing down study goals and hypotheses prior to collecting and/or analyzing data. One concern is that we do not have any data on whether or not pre-registration puts limits on exploration within research programs. If researchers are required to pre-register study goals and/or hypotheses, and given that in many instances good ideas are developed after seeing the data (not always before), then many good ideas may never be tested. This is of course a fair question worthy of discussion.*

But what we perhaps need to know first is approximately how much of our collective research is exploratory at present. We know that over 90% of all journal articles report statistically significant effects (no citation required), presumably for hypotheses developed prior to data collection. If so, then these data analyses have presumably been conducted in a confirmatory manner (i.e., to test hypotheses developed prior to data collection and/or analyses). Pre-registering these confirmatory hypotheses should therefore not be problematic or stifle discovery, particularly given current options that make pre-registering hypotheses very easy (e.g., the Open Science Framework, aspredicted.org). If these confirmatory hypotheses took time to develop via exploratory research, then this suggests a massive amount of exploratory research is currently not being reported in any publication outlet; this research represents the large part of the iceberg hidden beneath public perception, with the small confirmatory bit of research peeking into public awareness. If so, we should collectively figure out a way to make this large body of exploratory research, and the details of how these explorations helped researchers develop their confirmatory hypotheses, publicly available. This is important stuff!**

To the extent that the current literature is not, however, primarily presenting a priori hypotheses and confirmatory data analyses, it will contain a blend of confirmatory hypotheses and hypotheses developed during and/or after data analyses (i.e., exploration within the research program). Given that over 90% of all journal articles report statistically significant effects, and that not all articles contain sections that clearly delineate confirmatory hypotheses from those developed through exploration of the data being presented, it is an open question how much research is confirmatory versus exploratory. Pre-registration of study goals and/or hypotheses, both confirmatory and exploratory (and everything in between), may be one way to answer this question. And perhaps before setting up large scale randomized controlled trials to determine if pre-registration can limit exploration, we should know just how much exploration is actually going on, as well as the links between this exploration and confirmatory hypotheses that are subsequently developed. Many of us seem to agree that exploration is very important, so let’s make an effort to document our explorations more clearly and openly.

 

* Russ Poldrack is on record as not being nervous about pre-registration

** “stuff” is a technical term of course