Dash Updates: Fall, 2017
Throughout the summer the Dash team has focused on features that better integrate with researcher workflows. The goal: make data publishing as easy as possible.
With that, here are the releases now up on the Dash site. Please feel free to use our demo site dashdemo.ucop.edu to test features and practice submitting data.
- Dash enabled co-author ORCiDs. All listed co-authors now have the ability to link their ORCiD iD with their data publication.
- Dash notifies “administrators” (set for each instance: campus data librarians and publishing staff) when data are deposited, so researchers can get assistance enhancing their metadata (to make data more reproducible, transparent, and discoverable).
- Dash has rich text editing. The abstract, methods, and usage notes fields now have HTML text editors that allow for stylistic text editing to properly format information about the data publication.
- Dash allows for individual file download. Files in any version of a dataset may now be downloaded individually, rather than only as the entire dataset.
- Dash welcomes UC Davis. Researchers at UC Davis may now publish and share their research data at dash.ucdavis.edu.
- Dash welcomes UC Press journal Elementa. Authors submitting to Elementa may now use UC Press Dash for all data supporting journal publications.
So, what is Dash working on now?
In order to integrate with various aspects of research workflows, Dash needs an open REST API. The first API being built is a new deposit API. The team is talking with the repository community and gathering use cases to map out how Dash can integrate with journals and online lab notebooks, offering alternate ways of submitting data that are more in line with researcher workflows.
OA Week 2017: Maximizing the value of research
By John Borghi and Daniella Lowenberg
Happy Friday! This week we’ve defined open data, discussed some notable anecdotes, outlined publisher and funder requirements, and described how open data helps ensure reproducibility. To cap off open access week, let’s talk about one of the principal benefits of open data- it helps to maximize the value of research.

Research is expensive. There are different ways to break it down but, in the United States alone, billions of dollars are spent funding research and development every year. Much of this funding is distributed by federal agencies like the National Institutes of Health (NIH) and the National Science Foundation (NSF), meaning that taxpayer dollars are directly invested in the research process. The budgets of these agencies are under pressure from a variety of sources, meaning that there is increasing pressure on researchers to do more with less. Even if budgets weren’t stagnating, researchers would be obligated to ensure that taxpayer dollars aren’t wasted.
The economic return on investment for federally funded basic research may not be evident for decades and overemphasizing certain outcomes can lead to the issues discussed in yesterday’s post. But making data open doesn’t just refer to giving access to other researchers; it also means giving taxpayers access to the research they paid for. Open data also enables reuse and recombination, meaning that a single financial investment can actually fund any number of projects and discoveries.
Research is time consuming. In addition to funding dollars, the cost of research can be measured in the hours it takes to collect, organize, analyze, document, and share data. “The time it takes” is one of the primary reasons cited when researchers are asked why they do not make their data open. However, while it certainly takes time to ensure open data is organized and documented in such a way as to enable its use by others, making data open can actually save researchers time over the long run. For example, one consequence of the file drawer problem discussed yesterday is that researchers may inadvertently redo work already completed, but not published, by others. Making data open helps prevent this kind of duplication, which saves time and grant funding. However, the beneficiaries of open data aren’t just other researchers- the organization and documentation involved in making data open can keep researchers from having to redo their own work as well.
Research is expensive and time consuming for more than just researchers. One of the key principles for research involving human participants is beneficence– maximizing possible benefits while minimizing possible risks. Providing access to data by responsibly making it open increases the chances that researchers will be able to use it to make discoveries that result in significant benefits. Said another way, open data ensures that the time and effort graciously contributed by human research participants helps advance knowledge in as many ways as possible.
Making data open is not always easy. Organization and documentation take time. De-identifying sensitive data so that it can be made open responsibly can be less than straightforward. Understanding why doesn’t automatically translate into knowing how. But we hope this week we’ve given you some insight into the advantages of open data, both for individual researchers and for everyone who engages in, publishes, pays for, and participates in the research process.
OA Week 2017: Transparency and Reproducibility
By John Borghi and Daniella Lowenberg
Yesterday we talked about why researchers may have to make their data open. Today, let’s start talking about why they may want to.
Though some communities have been historically hesitant to do so, researchers appear to be increasingly willing to share their data. Open data even seems to be associated with a citation advantage, meaning that as datasets are accessed and reused, the researchers involved in the original work continue to receive credit. But open data is about more than just complying with mandates and increasing citation counts, it’s also about researchers showing their work.
From discussions about publication decisions to declarations that “most published research findings are false”, concerns about the integrity of the research process go back decades. Nowadays, it is not uncommon to see the term “reproducibility” applied to any effort aimed at addressing the misalignment between good research practices, namely those emphasizing transparency and methodological rigor, and academic reward systems, which generally emphasize the push to publish only the most positive and novel results. Addressing reproducibility means addressing a range of issues related to how research is conducted, published, and ultimately evaluated. But, while the path to reproducibility is a long one, open data represents a crucial step forward.
One of the most popular targets of reproducibility-related efforts is p-hacking, a term that refers to the practice of applying different methodological and statistical techniques until non-significant results become significant. The practice of p-hacking is not always intentional, but appears to be quite common. Even putting aside some truly astonishing headlines, p-hacking has been cited as a major contributor to the reproducibility crisis in fields such as psychology and medicine.
One application of open data is sharing the datasets, documentation, and other materials needed to reproduce the results described in a journal article, so that other researchers (including peer reviewers) can check for errors and ensure that the conclusions discussed in the paper are supported by the underlying data and methods. This type of validation doesn’t necessarily prevent p-hacking, but it does increase the degree to which researchers are accountable for explaining marginally significant results.
But the impact of open data on reproducibility goes far beyond just combatting p-hacking. Consider publication biases such as the file drawer problem, which refers to the tendency of researchers to publish papers describing studies that resulted in positive results while relegating studies that resulted in negative or nonconfirmatory results to the proverbial file drawer. Along with problems related to small sample sizes, this tendency significantly skews the effects described in the scientific literature. Open data provides a means for opening the file drawer, allowing researchers to share all of their results- even those that are negative or nonconfirmatory.
Open data is about researchers showing their work, being transparent about how they reach their conclusions, and providing their data for others to use and evaluate. This allows for validation and helps combat common but questionable research practices like p-hacking. But open data also helps advance reproducibility efforts in a way that is less confrontational, by allowing researchers to open the file drawer and share (and get credit for) all of their work.
OA Week 2017: Policies, Resources, & Guidance
By John Borghi and Daniella Lowenberg
Yesterday, through quotes and anecdotes, we outlined reasons why researchers should consider making their data open. We’ll dive deeper into some of these reasons tomorrow and on Friday, but today we’re focused on mandates.
Increasingly, funding agencies and scholarly publishers are mandating that researchers open up their data. Different agencies and publishers have different policies so, if you are a researcher, it can be difficult to understand exactly what you need to do and how you should go about doing it. To help, we’ve compiled a list of links and resources.

Funder Policy Guidance:
The links below outline US federal funding policies as well as nonprofit and private funder policies. We also recommend getting in touch with your Research Development & Grants office if you have any questions about how a policy may apply to your grant-funded research.
US Federal Agency Policies:
http://datasharing.sparcopen.org/data
http://www.library.cmu.edu/datapub/sc/publicaccess/policies/usgovfunders
Global & Private Funder Policies:
https://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy
https://wellcome.ac.uk/funding/managing-grant/policy-data-software-materials-management-and-sharing
Publisher Policy Guidance:
Below is a list of publishers that oversee thousands of the world’s journals, along with their applicable data policies. If you have questions about how to comply with these policies, we recommend getting in touch with the journal you are aiming to submit to, either during the research process or before submission, to expedite peer review and comply with journal requirements. It is also important to note that if the journal you are submitting to requires data to be publicly available, this means that the data underlying the results and conclusions of the manuscript must be submitted, not necessarily the entire study. These data are typically the values behind statistics, data extracted from images, qualitative excerpts, and data necessary to replicate the conclusions.
PLOS: http://journals.plos.org/plosone/s/data-availability
Elsevier: https://www.elsevier.com/about/our-business/policies/research-data#Policy
Springer-Nature: https://www.springernature.com/gp/authors/research-data-policy/springer-nature-journals-data-policy-type/12327134
PNAS: http://www.pnas.org/site/authors/editorialpolicies.xhtml#xi
Resources, Services, and Tools (The How)
Thinking about and preparing your data for publication and free access requires planning before and during the research process. Check out the free Data Management Plan (DMP) Tool: www.dmptool.org
For researchers at participating UC campuses, researchers in earth science and ecology (DataONE), and researchers submitting to the UC Press journals Elementa and Collabra, check out Dash, a data publishing platform: dash.ucop.edu
We also recommend checking out www.re3data.org and https://fairsharing.org for standards in your field and for repositories, both field-specific and general, that will help you meet funder and publisher requirements and make your data open.
If you are a UC researcher, click on the name of your campus below for library resources to support researchers with managing, archiving, and sharing research data

OA Week 2017: Stories & Testimonials
By John Borghi and Daniella Lowenberg
Because of the tools and services we offer, we here at UC3 spend a lot of time talking about how to make data open. But, for open access week, we’d also like to take some time to talk about why. We think this is best illustrated by comments we collected from the community as well as excerpts from publications and public statements:
Open Data in order to have a broader reach with your work
Dr. Jonathan Eisen (UC Davis): “Starting in about 2009, we started publishing “data papers” to go with our open release of genome sequence data. These papers just report on the generation of the genome data and not analysis of the data. And these data reports have led to a large number of citations for me and my collaborators. For example for the Genomic Encyclopedia of Bacteria and Archaea project, we have published > 100 genome sequence data reports and these have in total been cited at least a few thousand times.
It is a win win approach for us. We publish papers detailing the generation of open data, which in turn I believe makes people feel more comfortable using that data, when they use the data they cite the papers, and we get more academic and general credit for the data. In the past, when people used our data in Genbank when there was no specific paper on just that data set, people were less likely to cite it.”
Open Data in order to find cures
In order to “measure progress by improving patient outcomes, not just publications”, open data is a central feature of the Cancer Moonshot Initiative led by former vice president Joe Biden. Similarly, efforts like clinicaltrials.gov and healthdata.gov aim to expose high value data in the hopes of facilitating better health outcomes.
Open Data in order to aid with the peer review process
Meghan Byrne (Senior Editor, PLOS ONE): “In our experience at PLOS ONE, making data openly available to the reviewers can help move the review process forward more quickly, particularly if the data are clearly reported, with the relevant metadata. In fact, we find that an increasing number of Academic Editors and reviewers are requesting to see the data, so having them ready at the time of submission can help reduce the time to publication. Once the paper is published, making the data publicly available increases the overall impact of the work.”
Open Data in order to advance scientific discovery
Open data from the Compact Muon Solenoid (CMS) at CERN’s Large Hadron Collider was recently used by researchers outside the organization to confirm a hypothesis about quantum chromodynamics (read more here). Though this is only one example, it is demonstrative of the immense potential for open data to facilitate discovery as new methods and analyses are applied to old data.
Open Data in order to extend the value of research investment
Carly Strasser (Moore Foundation): “We want research that we fund to be widely available. Free and open access to the research outputs that we fund is critical for ensuring maximum impact.”
Welcome to OA Week 2017!
By John Borghi and Daniella Lowenberg
It’s Open Access week and that means it’s time to spotlight and explore Open Data as an essential component to liberating and advancing research.

Let’s Celebrate!
Who: Everyone. Everyone benefits from open research. Researchers opening up their data provides access to the people who paid for it (including taxpayers!), patients, policy makers, and other researchers who may build upon it and use it to expedite discoveries.
What: Making data open means making it available for others to use and examine as they see fit. Open data is about more than just making the data available on its own; it is also about opening up the tools, materials, and documentation that describe how the data were collected and analyzed and why decisions about the data were made.
When: Data can be made open anytime a paper is published, anytime null or negative results are found, anytime data are curated. All the open data, all the time.
Where: If you are a UC researcher, free resources are available on your campus’s Research Data Management library website. Dash is a data publication platform for making your data open and archived, available to participating UC campuses, UC Press, and DataONE’s ONEShare. For more open data resources, check out our upcoming post on Wednesday, October 25th.
Why: Data are what support conclusions, discoveries, cures, and policies. Opening up articles for free access to the world is very important, but the articles are only so valuable without the data that went into them.
Follow this week as we cover policies, user stories, resources, economics, and justifications for why researchers should all be making their (de-identified, IRB approved) data freely available.
Tweet to us @UC3CDL with any questions, comments, or contributions you may have.
Upcoming Posts
Tuesday, October 24th: Open Data in Order to… Stories & Testimonials
Wednesday, October 25th: Policies, Resources, & Guidance on How to Make Your Data Open
Thursday, October 26th: Open Data and Reproducibility
Friday, October 27th: Open Data and Maximizing the Value of Research
Doing it Right: Get Credit for Your Research
Join research data specialists from the University of California Curation Center to talk about planning, publishing, and getting your data out there.
When: Friday, November 3rd 2:00pm
Where: BIDS, UC Berkeley Doe Library
There will be snacks.
Co-Author ORCiDs in Dash
Recently, the Dash team enabled ORCiD login. And while this configuration is important for primary authors, the Dash team feels strongly that all contributors to data publications should get credit for their work.
All co-authors of a published dataset now have the ability to authenticate and attach their ORCiD in Dash.
How this works:
- Data are published by a corresponding author who has the ability to authenticate their own ORCiD but they cannot enter other ORCiDs for co-authors. Bearing this in mind, Dash has a space for co-author email addresses to be entered.
- If email addresses are entered for co-authors, upon publication of the data, co-authors will receive an email notification. This notification will have a note about ORCiD iDs and a URL that directs to Dash.
- Co-authors who click this URL are directed to a pop-up box over the dataset landing page, which sends them to ORCiD for login and authentication.
- After an ORCiD iD is entered and authenticated, the author is returned to the Dash landing page for their dataset and their ORCiD iD will appear by their name.
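For anyone handling the iDs from the steps above outside of Dash (in a spreadsheet of co-authors, say), a mistyped iD can be caught before anyone is sent off to authenticate. The sketch below is illustrative only and is not Dash's code: the function names are hypothetical, and it assumes the ISO 7064 MOD 11-2 check character that ORCID documents for its identifiers.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the check character for the first 15 digits of an ORCID iD.

    Assumes the ISO 7064 MOD 11-2 scheme described in ORCID's identifier docs.
    """
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A result of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string looks like a well-formed ORCID iD.

    This only checks the format and checksum; it does not prove the iD
    is registered or belongs to a particular person.
    """
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15]
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_digit(base) == check
```

Calling `is_valid_orcid("0000-0002-1825-0097")` returns True, while changing the final character to 8 returns False. A passing check only says the iD is well formed, which is why the authentication step described above still matters.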
Dash Enables ORCiD Login
The Dash team has now added a second way to log in and submit. In addition to using Single Sign-On, users now have the ability to log in with ORCiD. This means that not only can you authenticate with ORCiD, but once you have logged in this way, your ORCiD iD will connect to your Dash account. The next time you submit to Dash, your ORCiD iD will auto-populate in your submission form.
To back up a little: ORCiD is a persistent identifier used to distinguish researchers from one another, and connect researchers with their research. If you are a researcher and do not currently have an ORCiD, sign up!
To connect your ORCiD:
- Log in using the button on the far right of the Dash homepage.
- Here you will see two options. Clicking the top ORCiD button will send you out to the ORCiD authentication page and, after you correctly enter your ORCiD info, send you back to Dash.

- Although you have now successfully authenticated with ORCiD, to ensure you are connected to your correct submitting instance (a campus, a department, DataONE, etc…) you will be asked to choose your Single Sign-On. This is the only time you will be asked to log in twice.

- After successfully logging in with Single Sign-On, you will have your account connected to your ORCiD. In the future, you will not need to repeat this process; instead you will be able to either save your login to your browser or choose one of the two options for logging in.
If you have already submitted to Dash before, you may log out and go through the same steps above. This process will tie your ORCiD to your existing account and allow for either ORCiD or Single Sign-On in the future.
Dash: The Data Publication Tool for Researchers
This post has been crossposted on Medium
We all know that research data should be archived and shared. That’s why we created Dash, a data publishing platform free to UC researchers. Dash complies with journal and funder requirements, follows best practices, and is easy to use. In addition, new features are continuously being developed to better integrate with your research workflow.
Why Dash is the best solution for UC researchers:
- Data are archived indefinitely. You can use Dash to ensure all of your research data will be available even after you get a new computer or switch institutions. Beyond that, your data will have all the important associated documentation on the funding sources for the research, the research methods and equipment used, and readme files on how your data were processed, so future researchers from your own lab or around the globe can use your work.
- Data can be published at any time. While we do have features that assist with affiliated article publication, like keeping your data private during the review process, Data Publications do not need to be associated with an article. Publish your data at any point in time.
- Data can be versioned. As you update and optimize protocols, or do further analysis on your data, you may update your data files or documentation. Your DOI will always resolve to a landing page listing all versions of the dataset.
- Data can be uploaded to Dash directly from your computer or through a “manifest”. “Manifest” means you may enter up to 1000 URLs where your data are living on servers, Box, Dropbox, or Google Drive, and the data will be transferred to Dash without waiting several hours or dealing with timeouts.
- You can upload up to 100 GB of data per submission.
- Dash does not limit file type. So long as the data are within the size limits listed above, publications can be image data, tabular data, qualitative data, etc…
- Related works can be linked. Code, articles, other datasets, and protocols can be linked to your data for a more comprehensive package of your research.
- Data deposited to Dash receive a DOI. This means that not only can your data be located but you can cite your data as you would articles. The landing page for each dataset includes an author list for your citation as well, so each author who contributed to the data collection and analysis may receive credit for their work.
- Data are assigned an open license. Deposited data are publicly available for anyone to re-use under a Creative Commons license. You put many hours and coffees into producing these data, and public release will give your research a broader reach. A light reminder that your name is still associated with your data, and making your data public does not mean you are “giving away” your work.
- Dash is a UC project. Dash can be customized per campus. Many campus libraries are subsidizing the cost of storage, and it is developed by the University of California Curation Center (UC3), meaning this service is set up to serve your needs.
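If you are assembling a manifest with a script, the limits in the list above (up to 1000 URLs, up to 100 GB per submission) can be checked before you start a submission. This is a minimal sketch under stated assumptions, not how Dash itself validates anything: the function name and the (URL, size) input shape are hypothetical, file sizes are assumed to be already known, and "100 GB" is assumed to mean decimal gigabytes.

```python
from urllib.parse import urlparse

# Limits taken from the feature list above.
MAX_MANIFEST_URLS = 1000
# Assumption: "100 GB" means decimal gigabytes (10**9 bytes each).
MAX_SUBMISSION_BYTES = 100 * 10**9


def check_manifest(entries):
    """Return a list of problems found in a manifest, empty if none.

    `entries` is a list of (url, size_in_bytes) pairs. This is a
    hypothetical pre-flight check, not part of Dash.
    """
    problems = []

    if len(entries) > MAX_MANIFEST_URLS:
        problems.append(
            f"{len(entries)} URLs listed; the limit is {MAX_MANIFEST_URLS}"
        )

    for url, _size in entries:
        parsed = urlparse(url)
        # Only flag entries that are clearly not web addresses.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"not a web URL: {url}")

    total_bytes = sum(size for _url, size in entries)
    if total_bytes > MAX_SUBMISSION_BYTES:
        problems.append(
            f"total size {total_bytes} bytes exceeds {MAX_SUBMISSION_BYTES}"
        )

    return problems
```

An empty list back from `check_manifest` means the manifest fits the stated limits. It says nothing about whether each URL is actually reachable, so the transfer into Dash remains the real test.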
We hear a lot about the cost of storage being an inhibitor. But, on many campuses, the storage costs associated with Dash are subsidized by academic libraries or departments. The cost of storage could also be written into grants (as funders do require data to be archived).
We are always looking for feedback on what features would be the most useful, so that we can make data publishing a part of your normal workflows. Get in touch with us or start using Dash to archive and share your data.