Did you know that data management plans existed before the NSF started requiring them? I know, it’s shocking. They have inherent value, even though most researchers had barely heard of them until now. A proper, thorough data management plan (DMP) can be a major time saver and a huge asset for a project. Funders tend to have minimal requirements for DMPs (e.g., a mere two pages allowed for an NSF proposal), and as a result researchers tend to underestimate the importance of the document. I’ve spoken to many researchers who wait until the last minute to start creating their DMP. Their plans then reflect their lack of knowledge about data stewardship, and the researchers are not properly prepared when their project starts generating data.
Here are a few ways to ensure you create a high-quality, thorough DMP:
- Take advantage of experts. Librarians should be partners with the researcher in creating a data management plan. Librarians are information professionals, and their business is essentially figuring out how to manage and preserve information (i.e., data). Consult them regularly when creating a plan: even if they don’t fully understand your data, they know how to find good standards and appropriate repositories, and they know who to talk to on campus.
- Take advantage of institutional resources, such as departmental servers, backup services, and IT professionals. Researchers are often unaware of the hardware and software their institutions provide, and these services and resources are often available at little or no cost.
- Think carefully about your data: the file formats, the common vocabularies and codes you will need, the metadata you will record, and the metadata standards you will follow. Do this as thoroughly as possible before any data are collected, so that you don’t have to go back and edit your datasets later (i.e., the dreaded “find/replace” tasks).
- Think carefully about your workflow and sketch out the plan for data processing and analysis. Workflows can be very informal, consisting of a simple flow chart (read my blog post about this). If you consider the iterations your data will go through before you start collecting, you are more likely to arrange your files, datasets, and collection procedures in a logical way.
- Know exactly where your data will be stored, both during the project and after the project is completed.
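To make the point about vocabularies and codes a bit more concrete: one option is to write down a small data dictionary before collection starts and check incoming rows against it. What follows is only a minimal sketch in Python. The column names, the allowed codes, and the `validate_row` helper are all hypothetical, made up for illustration; your own project would use whatever vocabulary you and your librarian settle on.

```python
# Hypothetical data dictionary: the columns and codes below are illustrative
# assumptions, not taken from any real project or standard.
DATA_DICTIONARY = {
    "site_id": {"description": "Sampling site code", "allowed": {"A1", "A2", "B1"}},
    "species": {"description": "Species code", "allowed": {"QUAG", "PIPO"}},
    # None means any value is accepted (free-form measurement).
    "height_cm": {"description": "Plant height in centimeters", "allowed": None},
}


def validate_row(row):
    """Return a list of problems found in one data row (a dict)."""
    problems = []
    for column, spec in DATA_DICTIONARY.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        allowed = spec["allowed"]
        if allowed is not None and row[column] not in allowed:
            problems.append(f"unexpected value in {column}: {row[column]!r}")
    return problems


if __name__ == "__main__":
    good = {"site_id": "A1", "species": "QUAG", "height_cm": 12.5}
    bad = {"site_id": "a1", "species": "QUAG"}
    print(validate_row(good))
    print(validate_row(bad))
```

Catching a lowercase `a1` on the day it is entered is a lot cheaper than hunting for it with find/replace a year later.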
Perhaps most importantly, consider this: a data management plan should be created early on and revisited throughout the project. Add a reminder to your calendar to re-read your plan every six months. Make sure new members of your lab group have read the plan and understand it. Update the plan as the project develops, and make sure the work of archiving the data is not pushed entirely to the end of the project.
What about examples? There are lots of examples out there for two-page NSF DMPs:
- A few examples on the DMPTool website, under Funder Requirements
- Examples from the UC San Diego Library (for NSF)
- Examples from the University of Minnesota Libraries
- Examples from the University of New Mexico Libraries (left side)
But be sure to check out more extensive examples and resources too:
- Examples from ICPSR (Inter-university Consortium for Political and Social Research). There are some great DMPs at the bottom of this page, arranged by discipline.
- List of resources from the Digital Curation Centre (DCC), based in the UK
- Examples from the DCC
Note that future development of the DMPTool will include a “DMP Library” of example plans, where researchers can browse others’ DMPs and share their own. Now go forth and plan!