Week 3: Open Notebook

The focus of discussion this week was on elements of the open science process to consider at the beginning of a research project. We started with a brief discussion of how the benefits of open and high-powered research outweigh the costs (LeBel, Campbell, & Loving, 2017). I then provided a tutorial on how to use the Open Science Framework (OSF) to manage the research workflow. Here is a link to a useful webinar. When I talk to people who are not familiar with the OSF, they are often confused about the difference between “projects” and “registrations”, and many think the OSF is primarily used to “pre-register” study hypotheses. I share my belief that whereas the OSF is useful for pre-registrations, it is even more useful as a free web platform for managing the research process with collaborators. In fact, a researcher could choose to keep all project pages private (i.e., not share them with the public) and never pre-register anything at all, but still use the OSF to efficiently manage his or her research workflow. That seems unlikely to happen, but the point is that projects on the OSF are dynamic, with a great deal of functionality over the course of the research process. In our lab we use the OSF to store research materials (scales, procedures, data files, code, ethics applications, and so on), but also, more and more, to document communications between collaborators during the research process. And that is where open notebook comes in.
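For those who prefer to script this kind of file management rather than click through the web interface, here is a minimal sketch using the third-party osfclient Python package (not an official OSF tool; the token, project id, and file paths below are placeholders):

```python
from osfclient import OSF

# Authenticate with a personal access token generated in your OSF settings
osf = OSF(token="YOUR_OSF_TOKEN")  # placeholder

# "abc12" is a placeholder for the 5-character GUID in a project's URL
project = osf.project("abc12")
storage = project.storage("osfstorage")

# List everything currently stored on the project page
for f in storage.files:
    print(f.path)

# Upload a research material (e.g., a scale) to the project
with open("materials/scale_items.pdf", "rb") as fp:
    storage.create_file("materials/scale_items.pdf", fp)
```

osfclient also ships a command-line interface with equivalent list/upload commands, which can be handy for keeping scales, ethics applications, code, and data files synchronized as a project progresses.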

Open notebook is “…the practice of making the entire primary record of a research project publicly available online as it is recorded.” At first blush this does not seem all that different from basic “open science” principles, whereby the researcher publicly shares research materials, specifies hypotheses in advance, outlines a data analytic plan, and then later on shares data, meta-data, and code (all topics to be covered in future classes). But the concept of open notebook also includes sharing the laboratory notebook, something that often contains communications between collaborators, as well as “dear diary” types of entries, that shed light on decision making throughout the research process. An open notebook can take many different forms, and there are some excellent examples from the biomedical sciences here, here, and here. These three examples take the form of dedicated web pages with regular updates, and I admire their commitment to “extreme open science”.

Rather than create a dedicated website for our lab notebooks, I wanted to develop an approach built on what our lab already uses on a daily basis: the OSF. That is where all of the research materials for our projects are already stored, but what has been missing to date is a record of the communications between colleagues during the research process that result in particular decisions being made. And decisions need to be made regularly, as new issues arise that were not considered earlier.

Current graduate student Nicolyn Charlot and I are trying out the following open notebook approach. First, rather than using email for basic communications about a current project (this one), we decided to make use of the “comments” function that is available for every project page (the “word bubble” icon located at the top right of the screen). That way our messages are documented over time as they occur and are embedded with all of our research materials. With notifications for comments set to “on” (under the settings tab), we receive emails when a new comment has been added. Second, because many of our decisions are made during lab meetings and not via email, we decided to briefly document the decision making process following each lab meeting in a shared Google Doc that is linked to a component nested within the main project page titled “Open Notebook”. Here is the open notebook for our project. It is within this component that we communicate using the comments function (click on it and see what we have so far), and where we keep the shared file. Our goal is to have an “Open Notebook” component for all new projects going forward, as a way to document the decision making process and to add a more personal element to the project pages beyond simply being a repository of files.
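Creating that nested component takes only a few clicks in the web interface, but a lab that wants one for every new project could script the step. Here is a minimal sketch against the public OSF v2 REST API; the token and parent project id are placeholders, and the category value is my assumption about how such a component might be labeled:

```python
import requests

API = "https://api.osf.io/v2"
TOKEN = "YOUR_OSF_PERSONAL_ACCESS_TOKEN"  # placeholder
PARENT = "abc12"  # placeholder GUID of the main project page

headers = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/vnd.api+json",  # OSF uses the JSON-API spec
}

# Create a child component titled "Open Notebook" under the main project
payload = {
    "data": {
        "type": "nodes",
        "attributes": {
            "title": "Open Notebook",
            "category": "communication",  # assumed category; others exist
        },
    }
}

resp = requests.post(f"{API}/nodes/{PARENT}/children/", json=payload, headers=headers)
resp.raise_for_status()
print("Created component:", resp.json()["data"]["id"])
```

Running this once per new project would create the “Open Notebook” component described above; the comments, notification settings, and linked Google Doc are then managed from the component’s own page.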

I am curious to see how it works out.

Week 2: Why Should Science be Open and Reproducible?

In one word: evaluation.

The motto of the Royal Society is Nullius in verba, or “Take nobody’s word for it”. When evaluating scientific claims, this means focusing on the appropriateness of the research process (e.g., study design, analytic design, interpretation of outcomes, and so on) rather than accepting the claims based on the authority of the individual(s) proposing them. To evaluate the reported outcomes of a study, it is therefore critical to have access to details of the research process that produced those outcomes.

For many years now, these details have typically been shared in the print pages of academic journals. Academic journals typically have page budgets, a contractually agreed-upon number of print pages that can appear in a one-year period. Longer individual papers therefore translate into fewer papers published in a given journal each year, meaning that Editors have often asked authors to shorten their papers during the review process. Given the complexity of many experimental designs, and the volume of information collected in many large-scale observational studies, it is not feasible to include all methodological details or results within each paper. Authors have typically provided as much detail about how the study was conducted as they felt was necessary to interpret the results presented, with the promise that omitted details were available upon request.

But are these details really available upon request? There are many reasons to believe that at least some details would not be. For example, individuals may change careers and no longer have access to these materials; over time we all die and are no longer able to provide these details; issues with storage (of both physical and digital copies) can arise that render the details inaccessible; people are busy and may not have the time to search for these details when asked. And the list goes on. As suggested by the readings for this week, it turns out that the majority of requested research details are not in fact available upon request! This includes access to data and analytic code, things that are important to have available given the non-trivial number of statistical reporting errors in the literature. Nullius in verba implies that we should not take the word of researchers who say “available upon request”.

The paper by Simmons, Nelson, and Simonsohn (2011) in Psychological Science, as well as the interesting information presented on www.flexiblemeasures.com, also suggests that evaluation of scientific claims requires knowing which research decisions (e.g., hypotheses, use of measures, data analytic plans) were made prior to collecting data and/or analyzing data versus after. For example, regarding the use of the Competitive Reaction Time Task (CRTT) in research assessing aggressive behavior, it has been shown that responses to this task have been quantified for data analysis in 156 different ways across 130 academic publications (http://www.flexiblemeasures.com/crtt/)! It seems likely that some of the decisions regarding how to quantify the CRTT within these publications were made during the process of data analysis (i.e., using a data-driven approach to determine a quantification strategy that yielded “significant” results), but it is impossible to determine which ones in the absence of pre-registration of data analytic strategies. Simmons et al. (2011) also demonstrate how the application of different Questionable Research Practices (QRPs) (e.g., checking for significance, collecting more data, then checking again; trying out a few different DVs; adding covariates and/or moderators) can inflate the Type I error rate well beyond the 5% level typically adopted by researchers using a frequentist data analytic approach, as the simulation sketch below illustrates. We need to know when decisions were made during the research process, and why, in order to properly evaluate the reported results.
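As a concrete illustration, here is a minimal simulation sketch in Python (my own code, not from the paper, though it mirrors the paper’s setup of two DVs correlated at r = .50 and collecting 10 more participants per group after an initial look):

```python
import numpy as np
from scipy import stats

# With no true effect, a flexible analyst who tries two DVs and then adds
# participants after a first look "finds" significance far more than 5% of
# the time.
rng = np.random.default_rng(2018)
cov = [[1.0, 0.5], [0.5, 1.0]]  # two DVs correlated at r = .50
n_sims, n_start, n_added = 10_000, 20, 10

def significant(x, y):
    """Two-sample t-test at the conventional alpha = .05."""
    return stats.ttest_ind(x, y).pvalue < 0.05

false_positives = 0
for _ in range(n_sims):
    a = rng.multivariate_normal([0, 0], cov, size=n_start)  # group 1, null true
    b = rng.multivariate_normal([0, 0], cov, size=n_start)  # group 2, null true

    # QRP 1: test both DVs, report whichever "works"
    hit = significant(a[:, 0], b[:, 0]) or significant(a[:, 1], b[:, 1])

    # QRP 2: if nothing is significant, collect 10 more per group and re-test
    if not hit:
        a = np.vstack([a, rng.multivariate_normal([0, 0], cov, size=n_added)])
        b = np.vstack([b, rng.multivariate_normal([0, 0], cov, size=n_added)])
        hit = significant(a[:, 0], b[:, 0]) or significant(a[:, 1], b[:, 1])

    false_positives += hit

print(f"False positive rate: {false_positives / n_sims:.3f}")  # well above .05
```

Simmons et al. (2011) report that individual QRPs of this sort roughly double the nominal 5% rate, and that combining all four of their example QRPs pushes it above 60%.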

Current technology allows for sharing important details of the research process as the research is being considered, conducted, analyzed, and written up into a manuscript. We are no longer limited by how many print pages a traditional journal is budgeted to print each year and then mail to subscribers. We are also not limited with respect to when we share details of the research process; we no longer have to wait for publication and squeeze everything we can into a manuscript.

This is an exciting time to be a scientist, but of course it is scary for some scientists to consider all of these changes occurring in a fairly short period of time. In my opinion, greater openness and transparency of the research process is the future of science. There is so much that is being done, and that can be done, to ensure our research is open and reproducible. Next week we start at the beginning of the research process.

Open and Reproducible Science—Introduction to my new Graduate Course

In 2014 my lab began the transition to using open and reproducible research practices (I wrote a blog post about it). Almost a year later, and after a steep learning curve, I realized I needed to organize my approach to open science. There was a lot of discussion within the field of psychology at that time about the idea of open science, but few suggestions on how to actually implement different open science practices throughout the research workflow for different types of research projects. Together with Etienne LeBel and Tim Loving, I thought it through, published a paper in December 2014 with some specific recommendations (or so they seemed at the time), and then our lab made it up as we went along. To my pleasant surprise, I was asked to give a workshop on “doing open science” in November 2015 at the University of Toronto, Mississauga. I really enjoyed talking to faculty and students about this topic, and I was honored to be asked to give similar workshops/talks at many different places during the next two years. Overall, I have now given 14 presentations on open and reproducible science in Canada, the USA, New Zealand, and Turkey (at the bottom of this post there is a list of these talks, including links to slides and recordings where available). I am also very happy to see that in the three or so years since publishing our open science recommendations, many journals in the field of psychology have changed their editorial policies to align with open science practices.

As I developed and tweaked my slides over this two-year period, I learned a lot more about (a) what is being done in different fields to enhance research transparency and reproducibility, and (b) what can be done with existing technology. With all of this information, I decided to create a new graduate course called “Open and Reproducible Science” so I could share with trainees how they can begin their research careers in a way that makes their future publications more open to evaluation and more likely to be reproducible (in many ways) by others (something I suggested was lacking in our graduate training programs here). I put together a syllabus and solicited feedback via Twitter. I received many helpful suggestions, as well as two offers of guest lectures: one by Seth Green from CodeOcean.com, and one from Joanne Paterson, a librarian at Western University. Click here to see what I ended up putting together for this inaugural course. I am excited that 16 grad students from different areas of my psychology department enrolled in this course, beginning January 11, 2018.

My goal is to write expanded lecture notes in a blog post for each week of the class. In these posts I will discuss my planned talking points for each class, as well as flesh out specific examples of how one might use different open science practices throughout the research workflow. OK, now I need to go re-read the assigned article for the first class: Munafò et al.’s (2017) “A manifesto for reproducible science”.

Invited Talks on Open Science and Replication

2015

  • November 3, Workshop presented at the University of Toronto, Mississauga (Psychology), Canada

2016

  • January 28, Pre-Conference of the Society for Personality and Social Psychology (SPSP), San Diego, USA
  • June 10, Conference of the Canadian Psychological Association, Victoria, Canada
  • October 3, York University (Psychology), Canada (audio recording)
  • October 11, University of Toronto (Psychology), Canada
  • October 19, University of Guelph (Family Relations and Applied Nutrition), Canada
  • October 21, Illinois State University (Psychology), USA
  • November 11, Victoria University of Wellington (Psychology), New Zealand
  • November 24, University of Western Ontario (Clinical Area), Canada
  • December 2, University of Western Ontario (Developmental Area), Canada

2017

  • January 19, Workshop presented at Sabanci University, Istanbul, Turkey (with thanks to a Travel Grant awarded to Asuman Buyukcan-Tetik and me from the European Association of Social Psychology)
  • March 10, Western Research Forum Panel Discussion on Open Access: “What’s in it for me?”, London, Canada
  • May 25, Workshop presented at the conference of the Association for Psychological Science (APS), Boston, USA
  • November 10, Plenary address, conference of the Society for the Scientific Study of Sexuality (SSSS), Atlanta, USA