A new opportunity to build a better (data) future
Last month I left my comfort zone.
After 30 years of working as an engineer, developer, and technical leader at Scripps Institution of Oceanography (SIO at UC San Diego), I started a new career as a Senior Product Manager and Research Data Specialist with the UC Curation Center (UC3) at the California Digital Library. While it may sound like a big change, it was more of a steady evolution.
Although my projects at SIO initially focused on scientific instrumentation, software development, and engineering specifications, I found the curation of in situ data fascinating and better aligned with my skills and preferences. This led to service opportunities, including leadership positions within national and international data initiatives, and those projects allowed me to collaborate with members of UC3.
Joining their team was the next logical step.
The transition from being part of the technical staff in a research setting to being a hands-on data advocate in UC3 has been an invigorating challenge so far, and it provides an excellent opportunity to build on my foundation of knowledge and grow in new areas.
It’s an honor to pick up where my predecessor, Daniella Lowenberg, left off. I’ve long admired her approach to all things data. I am grateful for the extraordinary measures that she and John Chodacki have taken to bring me up to speed as soon as possible.
Data publishing is a dynamic, young field, and my colleagues and I will be able to help shape the conversations, initiatives, and tools that serve the international research community. I look forward to working with my new colleagues as we advocate for open data and help build and implement infrastructure to make data more discoverable, interoperable, and reusable.
csv,conf,v5 moves online
csv,conf is a non-profit community conference run by folks who really love data and sharing knowledge. In the first two years, organizers established the event’s scope and community in Berlin, Germany. In the third and fourth years, the organizers moved the event to Portland, Oregon. This year, we had hoped to move the event to Washington, DC, and host csv,conf,v5 at the University of California Center in the nation’s capital. However, with the ongoing pandemic, we have moved the conference online.
Check out the csv,conf,v5 schedule at https://csvconf.com/speakers/
On May 13-14, 2020, the fifth version of csv,conf will be held virtually. Over two days, attendees will have the opportunity to hear about ongoing work, share skills, exchange ideas and kickstart collaborations. You are welcome to attend, but you must register by the end of day on May 12.
Register for csv,conf,v5 at https://csvconfv5.eventbrite.com
What is csv,conf?
Over the past several years, UC3 has worked with partners at The Carpentries, Open Knowledge International, DataCite, rOpenSci, and Code for Science and Society to organize csv,conf (https://csvconf.com). For those who aren’t familiar with the concept, csv,conf brings diverse groups together to discuss data topics, and features stories about data sharing and data analysis from science, journalism, government, and open source.
Although ubiquitous, the acronym CSV has varied meanings depending on who you ask. In the data space, CSV usually stands for comma-separated values – a machine-readable data format used to store tabular data in plain text. To many, the format represents simplicity, interoperability, compactness, and hackability, among other things.
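To make the format concrete, here is a minimal sketch of reading CSV data with Python’s standard `csv` module (the station names and values are invented for illustration and are not part of any conference material):

```python
import csv
from io import StringIO

# A tiny CSV document: a header row plus two data rows, all plain text.
# The field names and values below are hypothetical example data.
raw = "station,depth_m,temp_c\nSIO-1,10,18.2\nSIO-2,25,15.7\n"

# csv.DictReader maps each data row to a dict keyed by the header fields.
rows = list(csv.DictReader(StringIO(raw)))

for row in rows:
    print(row["station"], row["temp_c"])
```

Because the format is just delimited plain text, virtually every language and spreadsheet tool can read and write it, which is a big part of its appeal for data sharing.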
When it first launched in July 2014 as a conference for data makers everywhere, csv,conf adopted the comma-separated-values format metaphorically in its branding. Needless to say, as a data conference that brings together people from different disciplines and domains, conversations and anecdotes shared at csv,conf are not limited to the CSV file format.
Check out past conference sessions on our YouTube channel.
Join us online
Make sure to check out the csv,conf,v5 schedule at https://csvconf.com/speakers/ and register for csv,conf,v5 at https://csvconfv5.eventbrite.com
The UC3 team is excited to be part of the conference committee and happy to answer any questions you may have. Feel free to reach out to us at uc3@ucop.edu or to the full committee at csv-conf-coord@googlegroups.com.
OA Week 2017: Maximizing the value of research
By John Borghi and Daniella Lowenberg
Happy Friday! This week we’ve defined open data, discussed some notable anecdotes, outlined publisher and funder requirements, and described how open data helps ensure reproducibility. To cap off Open Access Week, let’s talk about one of the principal benefits of open data: it helps to maximize the value of research.

Research is expensive. There are different ways to break it down but, in the United States alone, billions of dollars are spent funding research and development every year. Much of this funding is distributed by federal agencies like the National Institutes of Health (NIH) and the National Science Foundation (NSF), meaning that taxpayer dollars are directly invested in the research process. The budgets of these agencies are under pressure from a variety of sources, so researchers face increasing pressure to do more with less. Even if budgets weren’t stagnating, researchers would be obligated to ensure that taxpayer dollars aren’t wasted.
The economic return on investment for federally funded basic research may not be evident for decades, and overemphasizing certain outcomes can lead to the issues discussed in yesterday’s post. But making data open doesn’t just mean giving access to other researchers; it also means giving taxpayers access to the research they paid for. Open data also enables reuse and recombination, meaning that a single financial investment can actually fund any number of projects and discoveries.
Research is time consuming. In addition to funding dollars, the cost of research can be measured in the hours it takes to collect, organize, analyze, document, and share data. “The time it takes” is one of the primary reasons cited when researchers are asked why they do not make their data open. However, while it certainly takes time to ensure open data is organized and documented in such a way as to enable its use by others, making data open can actually save researchers time over the long run. For example, one consequence of the file drawer problem discussed yesterday is that researchers may inadvertently redo work already completed, but not published, by others. Making data open helps prevent this kind of duplication, which saves time and grant funding. However, the beneficiaries of open data aren’t just other researchers: the organization and documentation involved in making data open can save researchers from having to redo their own work as well.
Research is expensive and time consuming for more than just researchers. One of the key principles for research involving human participants is beneficence: maximizing possible benefits while minimizing possible risks. Providing access to data by responsibly making it open increases the chances that researchers will be able to use it to make discoveries that result in significant benefits. Said another way, open data ensures that the time and effort graciously contributed by human research participants help advance knowledge in as many ways as possible.
Making data open is not always easy. Organization and documentation take time. De-identifying sensitive data so that it can be made open responsibly can be less than straightforward. Understanding why doesn’t automatically translate into knowing how. But we hope that this week we’ve given you some insight into the advantages of open data, both for individual researchers and for everyone who engages with, publishes, pays for, and participates in the research process.
OA Week 2017: Policies, Resources, & Guidance
By John Borghi and Daniella Lowenberg
Yesterday, through quotes and anecdotes, we outlined reasons why researchers should consider making their data open. We’ll dive deeper into some of these reasons tomorrow and on Friday, but today we’re focused on mandates.
Increasingly, funding agencies and scholarly publishers are mandating that researchers open up their data. Different agencies and publishers have different policies, so, if you are a researcher, it can be difficult to understand exactly what you need to do and how you should go about doing it. To help, we’ve compiled a list of links and resources.

Funder Policy Guidance:
The links below outline US federal funding policies as well as nonprofit and private funder policies. We also recommend getting in touch with your Research Development & Grants office if you have any questions about how a policy may apply to your grant-funded research.
US Federal Agency Policies:
http://datasharing.sparcopen.org/data
http://www.library.cmu.edu/datapub/sc/publicaccess/policies/usgovfunders
Global & Private Funder Policies:
https://www.gatesfoundation.org/How-We-Work/General-Information/Open-Access-Policy
https://wellcome.ac.uk/funding/managing-grant/policy-data-software-materials-management-and-sharing
Publisher Policy Guidance:
Below is a list of publishers that oversee thousands of the world’s journals, along with their applicable data policies. If you have questions about how to comply with these policies, we recommend getting in touch with the journal you are aiming to submit to, either during the research process or before submission, to expedite peer review and comply with journal requirements. It is also important to note that if the journal you are submitting to requires data to be publicly available, this means that the data underlying the results and conclusions of the manuscript must be submitted, not necessarily the entire study. These data are typically the values behind statistics, data extracted from images, qualitative excerpts, and data necessary to replicate the conclusions.
PLOS: http://journals.plos.org/plosone/s/data-availability
Elsevier: https://www.elsevier.com/about/our-business/policies/research-data#Policy
Springer-Nature: https://www.springernature.com/gp/authors/research-data-policy/springer-nature-journals-data-policy-type/12327134
PNAS: http://www.pnas.org/site/authors/editorialpolicies.xhtml#xi
Resources, Services, and Tools (The How)
Thinking about and preparing your data for publication and free access requires planning before and during the research process. Check out the free Data Management Plan (DMP) Tool: www.dmptool.org
For researchers at participating UC campuses, researchers in the earth sciences and ecology (via DataONE), and researchers submitting to the UC Press journals Elementa and Collabra, check out Dash, a data publishing platform: dash.ucop.edu
We also recommend checking out www.re3data.org and https://fairsharing.org for standards in your field, as well as repositories, both disciplinary and general, that will help you meet funder and publisher requirements and make your data open.
If you are a UC researcher, click on the name of your campus below for library resources to support researchers with managing, archiving, and sharing research data.

Welcome to OA Week 2017!
By John Borghi and Daniella Lowenberg
It’s Open Access week and that means it’s time to spotlight and explore Open Data as an essential component to liberating and advancing research.

Let’s Celebrate!
Who: Everyone. Everyone benefits from open research. Researchers opening up their data provides access to the people who paid for it (including taxpayers!), patients, policy makers, and other researchers who may build upon it and use it to expedite discoveries.
What: Making data open means making it available for others to use and examine as they see fit. Open data is about more than just making the data available on its own, it is also about opening up the tools, materials, and documentation that describes how the data were collected and analyzed and why decisions about the data were made.
When: Data can be made open anytime a paper is published, anytime null or negative results are found, anytime data are curated. All the open data, all the time.
Where: If you are a UC researcher, free resources are available on your campus’s Research Data Management library website. Dash is a data publication platform to make your data open and archived for participating UC campuses, UC Press, and DataONE’s ONEShare. For more open data resources, check out our upcoming post on Wednesday, October 25th.
Why: Data are what support conclusions, discoveries, cures, and policies. Opening up articles for free access to the world is very important, but the articles are only so valuable without the data that went into them.
Follow along this week as we cover policies, user stories, resources, economics, and justifications for why researchers should all be making their (de-identified, IRB-approved) data freely available.
Tweet to us @UC3CDL with any questions, comments, or contributions you may have.
Upcoming Posts
Tuesday, October 24th: Open Data in Order to… Stories & Testimonials
Wednesday, October 25th: Policies, Resources, & Guidance on How to Make Your Data Open
Thursday, October 26th: Open Data and Reproducibility
Friday, October 27th: Open Data and Maximizing the Value of Research
Building a Community: Three months of Library Carpentry
Back in May, almost 30 librarians, researchers, and faculty members got together in Portland, Oregon, to learn how to teach lessons from Software, Data, and Library Carpentry. After spending two days learning the ins and outs of Carpentry pedagogy and live coding, we all returned to our home institutions as part of the burgeoning Library Carpentry community.
Library Carpentry didn’t begin in Portland, of course. It started in 2014, when the community began developing a group of lessons at the British Library. Since then, dozens of Library Carpentry workshops have been held across four continents. But the Portland event, hosted by the California Digital Library, was the first Library Carpentry-themed instructor training session. Attendees not only joined the Library Carpentry community, but took their first step toward getting certified as Software and Data Carpentry instructors. If Library Carpentry was born in London, it went through a massive growth spurt in Portland.
Together, the Carpentries are a global movement focused on teaching people computing skills like navigating the Unix shell, doing version control with Git, and programming with Python. While Software and Data Carpentry are focused on researchers, Library Carpentry is by and for librarians. Library Carpentry lessons include an introduction to data for librarians, OpenRefine, and many more. Many attendees of the Portland instructor training contributed to these lessons during the Mozilla Global Sprint in June. After more than 850 GitHub events (pull requests, forks, issues, etc.), Library Carpentry ended up as far and away the most active part of the global sprint. We even had a five-month-old get in on the act!
Since the instructor training and the subsequent sprint, a number of Portland attendees have completed their instructor certification. We are on track to have 10 certified instructors in the UC system alone. Congratulations, everyone!
Great talks and fun at csv,conf,v3 and Carpentry Training
On May 2–5, 2017, I (Yasmin AlNoamany) was thrilled to attend the csv,conf,v3 2017 conference and the Software/Data Carpentry instructor training in Portland, Oregon, USA. It was a unique experience to attend and speak with many … Continue reading →