I am a newcomer to the world of libraries and information science. Being the new kid on the block is a familiar feeling for me: I have always been one to seek out new and interesting approaches to my questions. Whether it is learning mathematical modeling and genetic sequencing techniques to explore clam populations or exploring the ins and outs of freeganism, I like learning new things. So when I began working with questions related to scientific data, data sharing, and data reuse, I was comfortable asking, “What does curation mean?”
In case you are in the same boat as I was, this post seeks out a good description of “data curation”. First, let’s start with curation in general. The word originates from the Latin cura, meaning “care”. Most people have heard the word in reference to museums (e.g. a museum curator). These curators are charged with caring for museum collections; there are also art curators who focus on curating art collections.
Here is a description specific to scientific data curation, from the Data Conservancy:
Data curation is a means to collect, organize, validate, and preserve data so that scientists can find new ways to address the grand research challenges that face society.
This description definitely touches on what we are interested in facilitating with the Excel add-in, most notably the organization and preservation of data. Here is another take, summarized from Wikipedia. Data curation entails:
- Collecting verifiable data
- Providing capabilities for data search and retrieval
- Ensuring integrity of collected data
- Ensuring semantic and ontological continuity (i.e. making sure the data are described in a consistent way)
This description makes data curation sound like something reserved for libraries and data centers to tackle, which isn’t necessarily the case. I like the description laid out by the UK-based Digital Curation Centre because it touches on broader concepts related to curation. For them, data curation comprises:
- Data management
- Adding value to data (perhaps this means adding good contextual metadata?)
- Data sharing for reuse
- Data preservation for later reuse
This description seems to coincide nicely with the goals of the DCXL project: we are interested in helping scientists with each of the four points above. Really, data curation means managing, describing, and preserving your data so others might reuse it. So in summary: Go Forth and Curate! (I couldn’t resist this Flickr result for the search “curation”):