I advocate for open science. I love the word open and all of the things that this word implies for science. In keeping with last week’s post exploring what “data curation” means, here I touch very briefly on what open science means, and how it relates to the Excel add-in we are developing. Let me admit up front that I’m certainly not the expert on this subject. There is a great post at The Open Science Project’s blog; KQED (San Francisco’s public media outlet) recently ran an article that discusses the topic; and even the prestigious journal Nature (ironically, for-profit) ran an article about the benefits of open research in chemistry. A quick Internet search for “open science” will turn up a wealth of resources for exploring this topic further.
So what’s the big deal with open science? I argue that it harkens back to one of the foundational pillars of science: reproducibility. If no one else is able to recreate your results, then how are we to believe you? The current system for scholarly communication relies on journal publications, which condense the immense amount of work behind a given project into a 5-20 page manuscript. The chances of recreating results from a journal article alone are effectively zero. Over the course of scientific history, all indications are that people have been relatively honest in reporting their experimental results and observations (otherwise we wouldn’t have progressed this far). But wouldn’t it be nice to know for sure that the science was good?
Here’s where open science is such a fabulous idea. It suggests that, rather than limiting our scholarly communication to the publication of a few journal articles every year, we communicate on a daily basis. The Internet makes this possible with very little effort, and lays the foundation for rapid advancement of science. The basic idea is that you expose as much of your thought process as possible to the public. This may be by keeping an online lab notebook (via WordPress or OpenWetWare), publishing your code for scripted analysis, sharing your data and workflow online, or taking advantage of the many sites that facilitate open science, like rOpenSci, FigShare, SlideShare, ecologicaldata.org, myExperiment, EcologicalWebsDatabase, GeoCommons, and many, many others. By making as much of your work public as possible, you reap benefits like:
- public comments and suggestions on your work
- increasing opportunities for collaboration
- having a rebuttal for naysayers (“Go download my data and code if you don’t believe me”)
Of course, open science isn’t solely about reproducibility. It also ensures or enables:
- Trustworthiness
- Sharing methods and workflows
- Sharing data
- Making science possible for anyone, regardless of financial resources
- Promoting a community that is working towards common goals
How does the Excel add-in promote open science? First, the add-in is intended to be open source. That means the code will be available for developers to take and mold into something they find more useful for their purposes; that is, they can reuse the code. The add-in will also facilitate data sharing, since it will streamline the process of preparing data and submitting it to archives. Finally, it will be free to download, maximizing its accessibility for Excel users.
Open science is a good thing. Exposure might be a bit scary at first, but we all stand to benefit from shedding light on our work, our thought process, and our data.