We Can’t Succeed Alone

Daniella Lowenberg, December 18, 2018

Posted in: Data Publication, Dryad, Open Access

Within the realm of research data management, libraries spend resources building and providing tools that are not within researcher workflows and/or are not aligned with researcher values. By doing this, we are setting ourselves up for failure.

As mentioned in previous posts, part of my work is focused on incentivizing UC researchers to publish their data. And it has taught me that we are failing as an institutional and research stakeholder community to move the needle on the adoption of data publishing. Why? Because we are not listening to researchers. To get to the root of this issue, I have spent hours working with researchers to determine which features would get them to deposit, what we could do differently to get them to change their processes, etc. Unsurprisingly, a common response has been that integrations within existing workflows (ones that researchers are already familiar with or those driven by the research process) would be key.

Why is adoption key?

You may ask what I mean by “adoption of data publishing” and why it is so key. Of course, there are natural answers – we want our tools and services to be used. But for me, the adoption we need to be focused on goes deeper to mean more than just awareness or usage of tools. It should mean that researchers are publishing their data and valuing their data as a first-class research output. Of course article publishing has its flaws that we would like to avoid repeating, but researchers value articles like currency. Research data are the science underlying those articles and they deserve to be valued, as well.

Adoption of data publishing also means quality, curated datasets have understandable metadata specific to the dataset and preservation assurance that the data will persist. These are features that we as institutions value. These are the features that institutions include in their own repositories. These value points are why we would not consider a PDF of an Excel sheet or a copy of summary statistics a data publication. Successful adoption means a culture change where researchers are publishing their understandable and usable research data (employing institutional and community best practices), citing their data, and valuing their data publications like their articles.

We are nowhere near this type of success. However, it is essential that we continue to work together to drive this kind of adoption. If we do not work together, we will continue to build resources that offer subsets of these features, perhaps meeting our institutional needs, but that are not adopted by researchers. And if we fail at this adoption campaign, we cannot call the open data movement a success.

Powerful adoption goals

With this view of adoption as my guide, I had our Dash dev team build a robust API that could handle the different types of integrations that researchers need. I went to publishers, online lab notebooks, and computing spaces, and showed them how easy it would be to integrate with our tools. I was trying to break this adoption logjam, and it was my hope that, acting on behalf of UC, the largest research institution in the world, I would be able to deliver on our researchers’ feedback. I set goals for new deposits in the thousands and thought we had the answers that could get us to meet those targets. But what I found was another hurdle: not even UC has enough scale to interest an outside vendor in integrating with us. And after one year of trying, we were nowhere near reaching our adoption goals.

CDL is not alone. At CNI last week I asked the room of institutional stakeholders if any of their institutional platforms had more than 500 deposits. Not one of the 40+ people in the room raised their hand. This raises the question: If we are spending years on a couple hundred deposits, how can any of our institutions call this a success? All of us at institutions need to be self-reflective and evaluate ourselves against realistic success metrics. CDL has been in this boat. While my team did not want to hear that we had failed to meet our goals, after years of building an awesome platform for the UCs, it was the truth.

So we took a step back. We re-evaluated the feedback from our UC researchers and regrouped on whether we should put energy into a project destined for minimal success. The new question we asked ourselves: How can we continue to support our institutional values for research data publishing and get to a scale we want?

Acknowledging the truth brings new ventures

We spent a lot of time rethinking our motivations as libraries and as research institutions: we want high-quality datasets, we want the most PID-ified deposits possible, we want scale, we want ease of use, we want integrations, etc. We also spent time thinking about the motivations of our researchers: they want ease of use and low friction, with peace of mind that they are meeting requirements and doing the right thing.

Along the journey we concluded that serving just one institutional community was not a plausible way to drive adoption of data publishing, and my team began to look at successes in our wider community. We did a repository comparison based on features and values. What became extremely clear was that Dryad was and is aligned with what we were trying to achieve. Not only does Dryad have a similar mission statement, but Dryad has an indisputable amount of adoption and researcher support. They are part of the researcher workflow and they are focused on high-quality, curated datasets. These are all the things we all want, right?

As a matter of fact, when researchers at UC had said “your UC-specific Dash platform sounds cool but can I just put my data in Dryad like my collaborators?”, I would say yes, because publishing data in Dryad is better than fighting over data territory. But as we were looking for ways of achieving the scale and success that Dryad had achieved, we realized that what was needed was for us to better support Dryad as part of our institutional solution. So, after months of vetting and discussions on both sides, CDL decided to partner with Dryad. And, in this partnership, I get to walk the walk. Dryad’s submissions are on the rise and researchers have long supported Dryad.

Building for success

What we were missing was relevance and scale. What Dryad was missing was enhanced and innovative features to optimize their existing connections into researcher workflows. But as we embark on our new Dryad service, we need to be reminded that what also differentiates Dryad is that it works for researchers. And we need to remember that we need scale…so join us!

While we plan to leverage the new partnership to help UC and all member institutions grow adoption, we can’t build solely for our library and scholarly communications needs. The key word here is solely. Of course we as a library community should proudly lead the way in bringing better metadata, metrics, and infrastructure to the table to support the discoverability, use, and preservation of research outputs. But if our end goal is to support researchers, we cannot prioritize building out our services in a researcher-less silo.

The new Dryad

This means that the new Dryad should be able to connect into institutions like UC, reflecting their values, while remaining a researcher-focused service. As product manager, my goal for this new Dryad service is:

To focus all of our energy on adoption
To build out each feature to be user centered and tested
To be a seamless data publishing platform, integrated into research and publishing workflows
To add institution tools in ways that are transparent to and benefit the researcher

What does this all require? I need to ensure that we keep a finger on the pulse of publisher data policy workflows, integrate with computing spaces and notebooks, and stay continually in alignment with funder requirements (while thinking ahead to integrations with preprints and other funder-required spaces). This is where our library and scholarly communications expertise comes in. If we can build up an open community of support to ensure we are only building for best practices, we can ensure that our features and services are twofold: instilled with our values and presented in the easiest way for submitters.

Vision for the community

So let’s build an open community. A community of supporters who would like to focus strictly on researcher adoption of best data practices. No jargon. No discussions about back-end technology. A community that leverages each other’s projects that are aligned in values. A community that embraces a diverse set of tools to help grow adoption.

Let’s follow the principles of the Supporters Guide. It is possible for us all to work together, regardless of technologies, to ensure that our services and offerings in the research-supporting community are bound by our values and built for researcher use. More on that in a future post.