Dash Updates: Fall, 2017

Throughout the summer the Dash team has focused on features that better integrate with researcher workflows. The goal: make data publishing as easy as possible.

With that in mind, here are the releases now up on the Dash site. Please feel free to use our demo site, dashdemo.ucop.edu, to test features and practice submitting data.

So, what is Dash working on now?

To integrate with various aspects of research workflows, Dash needs an open REST API. The first API being built is a new deposit API. The team is talking with the repository community and gathering use cases to map out how Dash can integrate with journals and online lab notebooks, offering alternate ways of submitting data that are more in line with researcher workflows.
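To make the idea of a deposit API concrete, here is a minimal sketch of what a programmatic deposit might look like. This is a hypothetical illustration only: the real Dash deposit API was still being designed at the time of writing, so the endpoint URL, field names, and authentication scheme below are placeholders, not the actual interface.

```python
import json
import urllib.request

# Placeholder endpoint -- not the real Dash API URL.
DEPOSIT_URL = "https://dash.example.org/api/v1/datasets"

def build_deposit_request(title, authors, abstract, token):
    """Assemble a JSON POST request carrying a dataset's descriptive metadata.

    All field names here are illustrative assumptions, not Dash's schema.
    """
    payload = {
        "title": title,
        "authors": [{"name": name} for name in authors],
        "abstract": abstract,
    }
    return urllib.request.Request(
        DEPOSIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # deposits would need auth
        },
        method="POST",
    )

req = build_deposit_request(
    title="Example sequencing dataset",
    authors=["A. Researcher", "B. Collaborator"],
    abstract="Raw reads and processing scripts.",
    token="example-token",
)
print(req.method, req.full_url)
```

The point of an open API of this shape is that a journal submission system or an online lab notebook could construct such a request on a researcher's behalf, so depositing data becomes a side effect of work they are already doing.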

OA Week 2017: Maximizing the value of research

By John Borghi and Daniella Lowenberg

Happy Friday! This week we’ve defined open data, discussed some notable anecdotes, outlined publisher and funder requirements, and described how open data helps ensure reproducibility. To cap off Open Access Week, let’s talk about one of the principal benefits of open data: it helps maximize the value of research.

Research is expensive. There are different ways to break it down, but in the United States alone, billions of dollars are spent funding research and development every year. Much of this funding is distributed by federal agencies like the National Institutes of Health (NIH) and the National Science Foundation (NSF), meaning that taxpayer dollars are directly invested in the research process. The budgets of these agencies are under pressure from a variety of sources, putting increasing pressure on researchers to do more with less. Even if budgets weren’t stagnating, researchers would be obligated to ensure that taxpayer dollars aren’t wasted.

The economic return on investment for federally funded basic research may not be evident for decades, and overemphasizing certain outcomes can lead to the issues discussed in yesterday’s post. But making data open doesn’t just mean giving access to other researchers; it also means giving taxpayers access to the research they paid for. Open data also enables reuse and recombination, meaning that a single financial investment can fund any number of projects and discoveries.

Research is time consuming. In addition to funding dollars, the cost of research can be measured in the hours it takes to collect, organize, analyze, document, and share data. “The time it takes” is one of the primary reasons researchers cite when asked why they do not make their data open. However, while it certainly takes time to organize and document open data in a way that enables its use by others, making data open can actually save researchers time over the long run. For example, one consequence of the file drawer problem discussed yesterday is that researchers may inadvertently redo work already completed, but not published, by others. Making data open helps prevent this kind of duplication, which saves time and grant funding. And the beneficiaries of open data aren’t just other researchers: the organization and documentation involved in making data open can save researchers from having to redo their own work as well.

Research is expensive and time consuming for more than just researchers. One of the key principles for research involving human participants is beneficence: maximizing possible benefits while minimizing possible risks. Providing access to data by responsibly making it open increases the chances that researchers will be able to use it to make discoveries that result in significant benefits. Said another way, open data ensures that the time and effort graciously contributed by human research participants helps advance knowledge in as many ways as possible.


Making data open is not always easy. Organization and documentation take time. De-identifying sensitive data so that it can be made open responsibly can be less than straightforward. Understanding why doesn’t automatically translate into knowing how. But we hope this week we’ve given you some insight into the advantages of open data, both for individual researchers and for everyone who engages with, publishes, pays for, and participates in the research process.

OA Week 2017: Transparency and Reproducibility

By John Borghi and Daniella Lowenberg

Yesterday we talked about why researchers may have to make their data open; today let’s start talking about why they may want to.

Though some communities have been historically hesitant to do so, researchers appear to be increasingly willing to share their data. Open data even seems to be associated with a citation advantage, meaning that as datasets are accessed and reused, the researchers involved in the original work continue to receive credit. But open data is about more than just complying with mandates and increasing citation counts, it’s also about researchers showing their work.

From discussions about publication decisions to declarations that “most published research findings are false”, concerns about the integrity of the research process go back decades. Nowadays, it is not uncommon to see the term “reproducibility” applied to any effort aimed at addressing the misalignment between good research practices, namely those emphasizing transparency and methodological rigor, and academic reward systems, which generally emphasize the push to publish only the most positive and novel results. Addressing reproducibility means addressing a range of issues related to how research is conducted, published, and ultimately evaluated. But, while the path to reproducibility is a long one, open data represents a crucial step forward.

“While the path to reproducibility is a long one, open data represents a crucial step forward.”

One of the most popular targets of reproducibility-related efforts is p-hacking, a term that refers to the practice of applying different methodological and statistical techniques until non-significant results become significant. The practice of p-hacking is not always intentional, but appears to be quite common. Even putting aside some truly astonishing headlines, p-hacking has been cited as a major contributor to the reproducibility crisis in fields such as psychology and medicine.
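The mechanics of p-hacking are easy to demonstrate with a short simulation. The sketch below is our own illustration, not drawn from any of the studies cited above: it simulates studies in which there is no real effect at all, but where the analyst measures ten different outcomes and declares success if any one of them reaches p < 0.05. The sample sizes and number of outcomes are arbitrary choices for illustration.

```python
import math
import random

def p_two_sided(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

def experiment_finds_effect(rng, n=50, outcomes=10):
    """Simulate one study: both groups are drawn from the same distribution
    (no true effect), but the analyst tests `outcomes` separate measures and
    reports a finding if any of them crosses p < 0.05."""
    for _ in range(outcomes):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean_diff = sum(a) / n - sum(b) / n
        z = mean_diff / math.sqrt(2.0 / n)  # known unit variance -> z-test
        if p_two_sided(z) < 0.05:
            return True
    return False

rng = random.Random(42)
trials = 2000
false_positives = sum(experiment_finds_effect(rng) for _ in range(trials))
rate = false_positives / trials
# With 10 independent looks at the data, the chance of at least one spurious
# "significant" result is roughly 1 - 0.95**10, i.e. about 40%, far above
# the nominal 5% error rate a single pre-specified test would carry.
print(f"Apparent discovery rate with no true effect: {rate:.2f}")
```

This is why open data matters here: when the full set of measured outcomes is available alongside the paper, readers can see how many tests were run, not just the one that happened to cross the significance threshold.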

One application of open data is sharing the datasets, documentation, and other materials needed to reproduce the results described in a journal article, thus allowing other researchers (including peer reviewers) to check for errors and ensure that the conclusions discussed in the paper are supported by the underlying data and methods. This type of validation doesn’t necessarily prevent p-hacking, but it does increase the degree to which researchers are accountable for explaining marginally significant results.

But the impact of open data on reproducibility goes far beyond combatting p-hacking. Publication biases such as the file drawer problem, which refers to the tendency of researchers to publish papers describing studies with positive results while relegating studies with negative or nonconfirmatory results to the proverbial file drawer, also distort the record. Along with problems related to small sample sizes, this tendency significantly skews the effects described in the scientific literature. Open data provides a means for opening the file drawer, allowing researchers to share all of their results, even those that are negative or nonconfirmatory.

“Open data provides a means for opening the file drawer, allowing researchers to share all of their results, even those that are negative or nonconfirmatory.”

Open data is about researchers showing their work, being transparent about how they reach their conclusions, and providing their data for others to use and evaluate. This allows for validation and helps combat common but questionable research practices like p-hacking. But open data also helps advance reproducibility efforts in a less confrontational way, by allowing researchers to open the file drawer and share (and get credit for) all of their work.

OA Week 2017: Policies, Resources, & Guidance

By John Borghi and Daniella Lowenberg

Yesterday, through quotes and anecdotes, we outlined reasons why researchers should consider making their data open. We’ll dive deeper into some of these reasons tomorrow and on Friday, but today we’re focused on mandates.

Increasingly, funding agencies and scholarly publishers are mandating that researchers open up their data. Different agencies and publishers have different policies, so if you are a researcher, it can be difficult to understand exactly what you need to do and how you should go about doing it. To help, we’ve compiled a list of links and resources.

Funder Policy Guidance:

The links below outline US federal funding policies as well as nonprofit and private funder policies. We also recommend getting in touch with your Research Development & Grants office if you have any questions about how a policy may apply to your grant-funded research.

US Federal Agency Policies:

http://datasharing.sparcopen.org/data

http://www.library.cmu.edu/datapub/sc/publicaccess/policies/usgovfunders

Global & Private Funder Policies:

https://www.cancer.gov/research/key-initiatives/moonshot-cancer-initiative/funding/public-access-policy

https://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy

https://wellcome.ac.uk/funding/managing-grant/policy-data-software-materials-management-and-sharing


Publisher Policy Guidance:

Below is a list of publishers that oversee thousands of the world’s journals, along with their applicable data policies. If you have questions about how to comply with these policies, we recommend getting in touch with the journal you are aiming to submit to during the research process or before submission, both to expedite peer review and to comply with journal requirements. It is also important to note that if the journal you are submitting to requires data to be publicly available, this means the data underlying the results and conclusions of the manuscript must be submitted, not necessarily the entire study. These data are typically the values behind statistics, data extracted from images, qualitative excerpts, and data necessary to replicate the conclusions.

PLOS: http://journals.plos.org/plosone/s/data-availability

Elsevier: https://www.elsevier.com/about/our-business/policies/research-data#Policy

Springer-Nature: https://www.springernature.com/gp/authors/research-data-policy/springer-nature-journals-data-policy-type/12327134

PNAS: http://www.pnas.org/site/authors/editorialpolicies.xhtml#xi

Wiley: https://authorservices.wiley.com/author-resources/Journal-Authors/licensing-open-access/open-access/data-sharing.html


Resources, Services, and Tools (The How)

Thinking about and preparing your data for publication and free access requires planning before and during the research process. Check out the free Data Management Plan (DMP) Tool: www.dmptool.org

For researchers at participating UC campuses, earth science and ecology (DataONE), and researchers submitting to the UC Press journals Elementa and Collabra, check out Dash, a data publishing platform: dash.ucop.edu

We also recommend checking out www.re3data.org and https://fairsharing.org for standards in your field and repositories both in your field or generally that will help you meet funder and publisher requirements and make your data open.

If you are a UC researcher, click on the name of your campus below for library resources to support researchers with managing, archiving, and sharing research data.

UC Davis

UC Berkeley

UC San Francisco

UC Merced

UC Santa Cruz

UC Santa Barbara

UCLA

UC Irvine

UC Riverside

UC San Diego

OA Week 2017: Stories & Testimonials

By John Borghi and Daniella Lowenberg

Because of the tools and services we offer, we here at UC3 spend a lot of time talking about how to make data open. But, for open access week, we’d also like to take some time to talk about why. We think this is best illustrated by comments we collected from the community as well as excerpts from publications and public statements:

Open Data in order to have a broader reach with your work

Dr. Jonathan Eisen (UC Davis): “Starting in about 2009, we started publishing “data papers” to go with our open release of genome sequence data. These papers just report on the generation of the genome data and not analysis of the data. And these data reports have led to a large number of citations for me and my collaborators. For example for the Genomic Encyclopedia of Bacteria and Archaea project, we have published > 100 genome sequence data reports and these have in total been cited at least a few thousand times.

It is a win win approach for us. We publish papers detailing the generation of open data, which in turn I believe makes people feel more comfortable using that data, when they use the data they cite the papers, and we get more academic and general credit for the data. In the past, when people used our data in Genbank when there was no specific paper on just that data set, people were less likely to cite it.”

Open Data in order to find cures

In order to “measure progress by improving patient outcomes, not just publications”, open data is a central feature of the Cancer Moonshot Initiative led by former vice president Joe Biden. Similarly, efforts like clinicaltrials.gov and healthdata.gov aim to expose high value data in the hopes of facilitating better health outcomes.

Open Data in order to aid with the peer review process

Meghan Byrne (Senior Editor, PLOS ONE): “In our experience at PLOS ONE, making data openly available to the reviewers can help move the review process forward more quickly, particularly if the data are clearly reported, with the relevant metadata. In fact, we find that an increasing number of Academic Editors and reviewers are requesting to see the data, so having them ready at the time of submission can help reduce the time to publication. Once the paper is published, making the data publicly available increases the overall impact of the work.”

Open Data in order to advance scientific discovery

Open data from the Compact Muon Solenoid (CMS) at CERN’s Large Hadron Collider was recently used by researchers outside the organization to confirm a hypothesis about quantum chromodynamics (read more here). Though this is only one example, it is demonstrative of the immense potential for open data to facilitate discovery as new methods and analyses are applied to old data.

Open Data in order to extend the value of research investment

Carly Strasser (Moore Foundation): “We want research that we fund to be widely available. Free and open access to the research outputs that we fund is critical for ensuring maximum impact.”

Welcome to OA Week 2017!

By John Borghi and Daniella Lowenberg

It’s Open Access week and that means it’s time to spotlight and explore Open Data as an essential component to liberating and advancing research.

Let’s Celebrate!

Who: Everyone. Everyone benefits from open research. Researchers opening up their data provides access to the people who paid for it (including taxpayers!), patients, policy makers, and other researchers who may build upon it and use it to expedite discoveries.

What: Making data open means making it available for others to use and examine as they see fit. Open data is about more than just making the data available on its own, it is also about opening up the tools, materials, and documentation that describes how the data were collected and analyzed and why decisions about the data were made.

When: Data can be made open anytime a paper is published, anytime null or negative results are found, anytime data are curated. All the open data, all the time.

Where: If you are a UC researcher, free resources are available at each of your campuses’ Research Data Management library websites. Dash is a data publication platform for making your data open and archived, available to participating UC campuses, UC Press, and DataONE’s ONEShare. For more open data resources, check out our upcoming post on Wednesday, October 25th.

Why: Data are what support conclusions, discoveries, cures, and policies. Opening up articles for free access to the world is very important, but the articles are only so valuable without the data that went into them.

Follow this week as we cover policies, user stories, resources, economics, and justifications for why researchers should all be making their (de-identified, IRB approved) data freely available.

Tweet to us @UC3CDL with any questions, comments, or contributions you may have.

Upcoming Posts

Tuesday, October 24th: Open Data in Order to… Stories & Testimonials

Wednesday, October 25th: Policies, Resources, & Guidance on How to Make Your Data Open

Thursday, October 26th: Open Data and Reproducibility

Friday, October 27th: Open Data and Maximizing the Value of Research

Doing it Right: Get Credit for Your Research

Join research data specialists from University of California Curation Center to talk about planning, publishing, and getting your data out there.

When: Friday, November 3rd 2:00pm

Where: BIDS, UC Berkeley Doe Library

There will be snacks.

Co-Author ORCiDs in Dash

Recently, the Dash team enabled ORCiD login. And while this configuration is important for primary authors, the Dash team feels strongly that all contributors to data publications should get credit for their work.

All co-authors of a published dataset now have the ability to authenticate and attach their ORCiD in Dash.

How this works:

  1. Data are published by a corresponding author, who can authenticate their own ORCiD but cannot enter ORCiDs on behalf of co-authors. With this in mind, Dash provides a space to enter co-author email addresses.
  2. If email addresses are entered for co-authors, each co-author will receive an email notification when the data are published. This notification includes a note about ORCiD iDs and a URL that directs back to Dash.
  3. Co-authors who click this URL are directed to a pop-up box over the dataset landing page, which navigates them to ORCiD for login and authentication.
  4. After an ORCiD iD is entered and authenticated, the author is returned to the Dash landing page for their dataset, where their ORCiD iD will appear by their name.
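The notification step above can be sketched in a few lines. This is a hypothetical illustration only: the actual wording of Dash’s notification email and the format of its landing-page URLs are placeholders here, not Dash’s real output.

```python
from email.message import EmailMessage

def coauthor_notification(coauthor_email, dataset_title, landing_url):
    """Build the email inviting a co-author to attach their ORCiD iD.

    The subject line and body text are illustrative assumptions.
    """
    msg = EmailMessage()
    msg["To"] = coauthor_email
    msg["Subject"] = f"Your dataset '{dataset_title}' has been published"
    msg.set_content(
        "You are listed as a co-author on a published dataset.\n"
        "To attach your ORCiD iD to the record, visit the landing page\n"
        f"and authenticate with ORCiD: {landing_url}\n"
    )
    return msg

msg = coauthor_notification(
    "coauthor@example.edu",
    "Example sequencing dataset",
    "https://dash.example.org/dataset/doi-placeholder",  # placeholder URL
)
print(msg["Subject"])
```

The important design choice in the real workflow is that the co-author authenticates directly with ORCiD from the link, so the corresponding author never has to (and never can) assert someone else’s identity.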

 

Dash Enables ORCiD Login

The Dash team has now added a second way to log in and submit. In addition to using Single Sign-On, users now have the ability to log in with ORCiD. This means that not only can you authenticate with ORCiD, but once you have logged in this way, your ORCiD iD will be connected to your Dash account. The next time you submit to Dash, your ORCiD iD will auto-populate in your submission form.

To back up a little: an ORCiD iD is a persistent identifier used to distinguish researchers from one another and to connect researchers with their research. If you are a researcher and do not currently have an ORCiD iD, sign up!

To connect your ORCiD:

  1. Log in using the button on the far right of the Dash homepage.
  2. Here you will see two options. Clicking the top ORCiD button sends you to the ORCiD authentication page and, once you have correctly entered your ORCiD credentials, back to Dash.
  3. Although you have now successfully authenticated with ORCiD, to ensure you are connected to your correct submitting instance (a campus, a department, DataONE, etc.) you will be asked to choose your Single Sign-On. This is the only time you will be asked to log in twice.
  4. After successfully logging in with Single Sign-On, your account will be connected to your ORCiD. In the future, you will not need to repeat this process; you can either save your login to your browser or choose one of the two options for logging in. If you have already submitted to Dash before, you may log out and go through the same steps above. This process will tie your ORCiD to your existing account and allow for either ORCiD or Single Sign-On login in the future.

Dash: The Data Publication Tool for Researchers

This post has been crossposted on Medium

We all know that research data should be archived and shared. That’s why Dash was created: a data publishing platform free to UC researchers. Dash complies with journal and funder requirements, follows best practices, and is easy to use. In addition, new features are continuously being developed to better integrate with your research workflow.

Why is Dash the best solution for UC researchers?

We hear a lot about the cost of storage being a barrier. But on many campuses, the storage costs associated with Dash are subsidized by academic libraries or departments. The cost of storage can also be written into grants (as funders do require data to be archived).

We are always looking for feedback on what features would be the most useful, so that we can make data publishing a part of your normal workflows. Get in touch with us or start using Dash to archive and share your data.