At long last, DataONE has gone live. Veterans of the DCXL/DataUp blog are probably well aware of the DataONE organization and project, but for newcomers I will provide a brief overview. Fine print: this is NOT the official DataONE stance on DataONE. This is merely my interpretation of it.
To explain DataONE, let’s go through a little thought exercise. Let’s pretend I’m a researcher, starting a project on copepods in estuaries of the Pacific Northwest. I’m wondering who else has worked on them, what they have found, and whether I can use their data to help me parameterize my model. Any researcher will tell you the best way to do this is to start searching for relevant journal articles. I can then weave in and out of reference lists to home in on the authors, topics, and species that might be of most use, continually refining my searches until I am satisfied.
Imagine I need the data from some of those articles I found. I look for datasets on the authors’ websites, in the papers themselves, and online. Some of the work was funded by NOAA, so I check there for data. I Google like crazy. Alas, the data are nowhere to be found.
In real life, this is where I ended my search and started contacting authors directly. Although I should have also checked data repositories, I didn’t. This was mostly because I wasn’t aware of them when I did this work back in 2008. Sadly, many researchers are in the same state of ignorance that I was.
The good news is that there are A LOT of data repositories out there (check out Databib.org for an intimidating list). The bad news is it’s very difficult to know about and search all of the potential repositories with data you might want to use.
DataONE is all about linking together existing data repositories, allowing researchers to access, search, and discover all of the data through a single portal. It’s basically cyber-glue for the different data centers out there. The idea is that you go to the DataONE search engine (ONEMercury) and hunt for data. It tells you where the data are housed, gives you lots of metadata, and gives you access to data when the authors have allowed this.
But wait, there’s MORE!
DataONE is also all about providing tools for researchers to find, use, organize, and manage their data throughout the research life cycle. This is where DataUp connects with DataONE: DataUp will be part of the Investigator Toolkit, which also includes nifty things like the DMPTool, ONE-R (an R package for DataONE), and ONE-Drive (a Dropbox-esque way to look at data in DataONE, in production).
The exciting news this week is that DataONE’s search and discovery tool has gone live (check out the NSF press release or the DataONE press release). You can now start looking for data that might be housed in any participating repository. There are only a few data repositories (called member nodes in DataONE speak) currently on board, but the number is expected to increase exponentially over the coming years.
More questions about DataONE? I can help, or at least direct you to a person who can. Alternatively, start poking around the DataONE website and ONEMercury, and give feedback so we can make it better.