Our Path to FAIR Station
At UC3, our work focuses on how research activities, outputs, and systems connect across the lifecycle, from planning and data collection through to publication and reuse. We approach these challenges from a research infrastructure and information management perspective, which naturally extends upstream to where research begins, including field stations and other place-based environments. As we embark on the FAIR Station project, we wanted to reflect on some of the many projects and work that got us to this point.
Early foundations
UC3 was founded as CDL’s digital curation program to support the full research lifecycle. For more than 15 years, our work has centered on enabling connections from planning and data collection through to publication and reuse. One of our most formative collaborations was with the DataONE community, whose vision was ambitious: to support discovery, access, and reuse of environmental data across a distributed landscape, grounded in the realities of field-based, place-dependent research.
DataONE’s emphasis on lifecycle coordination and distributed data collection reinforced the importance of capturing context at the point of origin. These ideas, grounded in foundational work by UC3 and supported through NSF investment, continue to inform our work. At the same time, UC3’s other collaborations explored various entry points into the research lifecycle.
As a founding member of the DataCite community, we helped lead the adoption of DOIs for research data, strengthening how outputs are identified and connected. The DMPTool supports planning and structuring, while repository services enable publication and citation. In parallel, early work on field station identifiers, including pilots and community presentations, surfaced the importance of treating place as a first-class entity rather than background context.
Engagement with communities such as RDA, ESIP and OBFS further highlighted the challenges of connecting data, samples, and locations across distributed systems. This work established a strong foundation, but also exposed a critical gap: research context is not consistently captured or linked across the lifecycle.
Making infrastructure actionable
Over the past 10 years, a series of funded projects and community initiatives focused on strengthening the connective tissue of the research ecosystem. This included:
- Machine-actionable data management plans (NSF EAGER, 2017)
- Elevating data as a first-class research output (Sloan Foundation, 2017)
- Open data metrics and citation practices through Make Data Count (NSF EAGER, 2014)
- Sustainable model for the Research Organization Registry (ROR) (IMLS, 2020)
- Better research data management through vertical interoperability (NSF Conference, 2024)
During this period, UC3 also contributed to building shared identifier infrastructure, including the Research Organization Registry (ROR) (NSF EAGER, 2020) and broader PID strategy efforts. This work built directly on earlier exploration with the OBFS and NAML on field station identifiers, where efforts to identify place-based environments shaped our thinking about organizational identity, disambiguation, and persistent identifiers. This work established the ability to link research entities, while also making clear that identifiers alone are not sufficient without being embedded into real workflows.
During this same time, UC3 also worked closely with Dryad to evolve it into a platform for experimentation: collaborating on affiliation tracking, linking datasets to software and publications, and early integrations with data management planning workflows. These efforts moved infrastructure into practice, but also highlighted a limitation: much of our work remained downstream. The question of how to integrate these capabilities into place-based research environments remained.
FAIR Island: Infrastructure, Policy, and Practice
Over the years, UC3 has seen increasing alignment across our work in data management planning, persistent identifiers, and research outputs, and recognized an opportunity to bring these together in a place-based context. Through a partnership with UCNRS, and in particular with our frequent collaborators at the Gump South Pacific Research Station, we identified the field station reservation process as a strategic point of engagement where these elements could be introduced earlier in the research lifecycle.
UC3 made the case internally to pursue this work, leading to initial UC investment in what became the FAIR Island project. The core idea was to take the data policies and expectations typically expressed in grant applications and DMPs, and embed them directly into field station workflows. By working to integrate the DMP Tool with the UCNRS Reservation Application Management System (RAMS), we began exploring how policy, identifiers, and metadata could be introduced at the point where researchers request access to field sites.
This work marked the beginning of a deeper collaboration across UCOP, bringing together research infrastructure and operational systems that support day-to-day field station activities. With subsequent NSF support, FAIR Island expanded to include additional partners, including Metadata Game Changers and the Tetiaroa Society, allowing us to test more complete, end-to-end workflows across planning, data collection, and downstream integration.
Our work on FAIR Island demonstrated the importance of the reservation process. It represents a moment where researchers are already providing structured information, making it a natural and effective place to introduce data policies, identifiers, and expectations that can carry forward through the rest of the research lifecycle. This began a shift from connecting data after the fact to embedding those connections where research begins.
FAIR Samples and vertical interoperability
Building on this foundation, our recent projects have expanded into data collection workflows and cross-system integration. The FAIR Samples project (NSF EAGER, 2024) focuses on improving how physical samples are identified, described, and connected across workflows. A key focus is integrating sample management systems with the broader research ecosystem. This work builds on the concept of vertical interoperability, developed by our partners at RSpace, which focuses on how information moves across layers of the research process, from planning tools and field data collection to lab systems and repositories.
Rather than introducing new systems, the FAIR Samples approach emphasizes connecting existing infrastructure, including IGSNs for samples, tools like FieldMark for structured field data capture, and platforms like RSpace for managing and linking workflows. Together, these integrations demonstrate how coordinated tools can support end-to-end workflows without requiring entirely new systems.
FAIR Station: Bringing it all together
This brings us to FAIR Station. This project is not a new direction, but a continuation of our broader efforts to connect place, samples, and data across the research lifecycle.
Collaboration with UCNRS and work with its RAMS platform has been central to our efforts. What began as an operational system that was not easily extended has, through sustained UC investment, evolved into a platform that is now much better positioned for integration. That shift creates a new phase of opportunity for UC3 and UCNRS to work together on connecting planning, policy, identifiers, and downstream systems in ways that were not previously feasible.
It also opens the door to thinking beyond UC. As RAMS continues to mature, we have begun exploring how it can be extended and open sourced, creating a foundation that other field stations and research networks can adopt, adapt, and contribute to. The goal is not just to support a single system, but to help enable a broader platform where the global field station community can see themselves and participate. With funding from the Moore Foundation, we can now bring together this work and these partnerships to explore how field station systems can support more connected, interoperable research workflows at scale.
Looking ahead
Across UC3’s projects, there is a consistent way of working: connecting existing efforts, aligning with community practices, and building on infrastructure already in use rather than creating new systems in isolation. All of UC3’s work has only been possible through sustained support from funders and collaboration with communities and partners. That same model is essential for FAIR Station. We will continue to work with field stations, infrastructure providers, and partners like RSpace to extend systems like RAMS and support open, interoperable workflows.
Support your Data
Building an RDM Maturity Model: Part 4
By John Borghi
Researchers are faced with rapidly evolving expectations about how they should manage and share their data, code, and other research products. These expectations come from a variety of sources, including funding agencies and academic publishers. As part of our effort to help researchers meet these expectations, the UC3 team spent much of last year investigating current practices. We studied how neuroimaging researchers handle their data, examined how researchers use, share, and value software, and conducted interviews and focus groups with researchers across the UC system. All of this has reaffirmed our perception that researchers and other data stakeholders often think and talk about data in very different ways.
Such differences are central to another project, which we’ve referred to alternately as an RDM maturity model and an RDM guide for researchers. Since its inception, the goal of this project has been to give researchers tools to self-assess their data-related practices and access the skills and experience of data service providers within their institutional libraries. Drawing upon tools with convergent aims, including maturity-based frameworks and visualizations like the research data lifecycle, we’ve worked to ensure that our tools are user friendly, free of jargon, and adaptable enough to meet the needs of a range of stakeholders, including different research, service provider, and institutional communities. To this end, we’ve renamed this project yet again to “Support your Data”.

What’s in a name?
Because our tools are intended to be accessible to people with a broad range of perceptions, practices, and priorities, coming up with a name that encompasses complex concepts like “openness” and “reproducibility” proved to be quite difficult. We also wanted to capture the spirit of terms like “capability maturity” and “research data management (RDM)” without referencing them directly. After spending a lot of time trying to come up with something clever, we decided that the name of our tools should describe their function. Since the goal is to support researchers as they manage and share data (in ways potentially influenced by expectations related to openness and reproducibility), why not just use that?
Recent Developments
In addition to thinking through the name, we’ve also refined the content of our tools. The central element, a rubric that allows researchers to quickly benchmark their data-related practices, is shown below. As before, it highlights how the management of research data is an active and iterative process that occurs throughout the different phases of a project. Activities in the different phases are represented in different rows. Proceeding left to right, a series of declarative statements describe specific activities within each phase in order of how well they foster access to and use of data in the future.

The four levels, “ad hoc”, “one-time”, “active and informative”, and “optimized for re-use”, are intended to be descriptive rather than prescriptive.
- Ad hoc — Refers to circumstances in which practices are neither standardized nor documented. Every time a researcher has to manage their data, they have to design new practices and procedures from scratch.
- One time — Refers to circumstances in which data management occurs only when it is necessary, such as in direct response to a mandate from a funder or publisher. Practices or procedures implemented at one phase of a project are not designed with later phases in mind.
- Active and informative — Refers to circumstances in which data management is a regular part of the research process. Practices and procedures are standardized, well documented, and well integrated with those implemented at other phases.
- Optimized for re-use — Refers to circumstances in which data management activities are designed to facilitate the re-use of data in the future.
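The rubric's phases-by-levels structure lends itself to a simple self-assessment sketch. The minimal Python example below is illustrative only: the phase names and the "lowest level wins" benchmark are assumptions for demonstration, not the published rubric or its scoring.

```python
# Illustrative sketch of a "Support your Data"-style rubric as a data
# structure. Phase names and the scoring rule are hypothetical.

MATURITY_LEVELS = [
    "ad hoc",
    "one-time",
    "active and informative",
    "optimized for re-use",
]

# Hypothetical lifecycle phases (the rows of the rubric); the actual
# rubric's phase names may differ.
PHASES = ["planning", "collection", "analysis", "sharing"]

def self_assess(responses):
    """Given a mapping of phase -> chosen level, return the least mature
    level across phases as a simple overall benchmark."""
    indices = [MATURITY_LEVELS.index(responses[phase]) for phase in PHASES]
    return MATURITY_LEVELS[min(indices)]

example = {
    "planning": "one-time",
    "collection": "active and informative",
    "analysis": "ad hoc",
    "sharing": "one-time",
}
print(self_assess(example))  # -> ad hoc
```

Taking the minimum level as the overall benchmark is just one possible design choice; a per-phase profile, as in the actual rubric, is more informative than any single score.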
Each row of the rubric is tied to a one-page guide that provides specific information about how to advance practices as desired or required. Development of the content of the guides has proceeded sequentially. During the autumn and winter of 2017, members of the UC3 team met to discuss issues relevant to each phase, reduce the use of jargon, and identify how content could be localized to meet the needs of different research and institutional communities. We are currently working on revising the content based on suggestions made during these meetings.
Next Steps
Now that we have scoped out the content, we’ve begun to focus on the design aspect of our tools. Working with CDL’s UX team, we’ve begun to think through the presentation of both the rubric and the guides in physical media and online.
As always, we welcome any and all feedback about content and application of our tools.
OA Week 2017: Maximizing the value of research
By John Borghi and Daniella Lowenberg
Happy Friday! This week we’ve defined open data, discussed some notable anecdotes, outlined publisher and funder requirements, and described how open data helps ensure reproducibility. To cap off open access week, let’s talk about one of the principal benefits of open data: it helps maximize the value of research.

Research is expensive. There are different ways to break it down but, in the United States alone, billions of dollars are spent funding research and development every year. Much of this funding is distributed by federal agencies like the National Institutes of Health (NIH) and the National Science Foundation (NSF), meaning that taxpayer dollars are directly invested in the research process. The budgets of these agencies are under pressure from a variety of sources, meaning that there is increasing pressure on researchers to do more with less. Even if budgets weren’t stagnating, researchers would be obligated to ensure that taxpayer dollars aren’t wasted.
The economic return on investment for federally funded basic research may not be evident for decades, and overemphasizing certain outcomes can lead to the issues discussed in yesterday’s post. But making data open doesn’t just mean giving access to other researchers; it also means giving taxpayers access to the research they paid for. Open data also enables reuse and recombination, meaning that a single financial investment can actually fund any number of projects and discoveries.
Research is time consuming. In addition to funding dollars, the cost of research can be measured in the hours it takes to collect, organize, analyse, document, and share data. “The time it takes” is one of the primary reasons cited when researchers are asked why they do not make their data open. However, while it certainly takes time to ensure open data is organized and documented in such a way as to enable its use by others, making data open can actually save researchers time over the long run. For example, one consequence of the file drawer problem discussed yesterday is that researchers may inadvertently redo work already completed, but not published, by others. Making data open helps prevent this kind of duplication, which saves time and grant funding. However, the beneficiaries of open data aren’t just other researchers: the organization and documentation involved in making data open can save researchers from having to redo their own work as well.
Research is expensive and time consuming for more than just researchers. One of the key principles for research involving human participants is beneficence: maximizing possible benefits while minimizing possible risks. Providing access to data by responsibly making it open increases the chances that researchers will be able to use it to make discoveries that result in significant benefits. Said another way, open data ensures that the time and effort graciously contributed by human research participants helps advance knowledge in as many ways as possible.
Making data open is not always easy. Organization and documentation take time. De-identifying sensitive data so that it can be made open responsibly can be less than straightforward. Understanding why doesn’t automatically translate into knowing how. But we hope this week we’ve given you some insight into the advantages of open data, both for individual researchers and for everyone that engages, publishes, pays for, and participates in the research process.
Welcome to OA Week 2017!
By John Borghi and Daniella Lowenberg
It’s Open Access week and that means it’s time to spotlight and explore Open Data as an essential component to liberating and advancing research.

Let’s Celebrate!
Who: Everyone. Everyone benefits from open research. Researchers opening up their data provides access to the people who paid for it (including taxpayers!), patients, policy makers, and other researchers who may build upon it and use it to expedite discoveries.
What: Making data open means making it available for others to use and examine as they see fit. Open data is about more than just making the data available on its own, it is also about opening up the tools, materials, and documentation that describes how the data were collected and analyzed and why decisions about the data were made.
When: Data can be made open anytime a paper is published, anytime null or negative results are found, anytime data are curated. All the open data, all the time.
Where: If you are a UC researcher, free resources are available through each UC campus’s Research Data Management library website. Dash is a data publication platform to make your data open and archived for participating UC campuses, UC Press, and DataONE’s ONEShare. For more open data resources, check out our upcoming post on Wednesday, October 25th.
Why: Data are what support conclusions, discoveries, cures, and policies. Opening up articles for free access to the world is very important, but the articles are only so valuable without the data that went into them.
Follow this week as we cover policies, user stories, resources, economics, and justifications for why researchers should all be making their (de-identified, IRB approved) data freely available.
Tweet to us @UC3CDL with any questions, comments, or contributions you may have.
Upcoming Posts
Tuesday, October 24th: Open Data in Order to… Stories & Testimonials
Wednesday, October 25th: Policies, Resources, & Guidance on How to Make Your Data Open
Thursday, October 26th: Open Data and Reproducibility
Friday, October 27th: Open Data and Maximizing the Value of Research
Great talks and fun at csv,conf,v3 and Carpentry Training
“Day1 @CSVConference! This is the coolest conf I ever been to #csvconf pic.twitter.com/ao3poXMn81” — Yasmina Anwar (@yasmina_anwar) May 2, 2017

On May 2–5, 2017, I (Yasmin AlNoamany) was thrilled to attend the csv,conf,v3 2017 conference and the Software/Data Carpentry instructor training in Portland, Oregon, USA. It was a unique experience to attend and speak with many …
Source: Great talks and fun at csv,conf,v3 and Carpentry Training
The Science of the DeepSea Challenge
Recently the film director and National Geographic explorer-in-residence James Cameron descended to the deepest spot on Earth: the Challenger Deep in the Mariana Trench. He partnered with lots of sponsors, including National Geographic and Rolex, to make this amazing trip happen. A lot of folks outside of the scientific community might not realize this, but until this week, there had been only one successful descent to the trench by a human-occupied vehicle (that’s a submarine for you non-oceanographers). You can read more about that 1960 exploration here and here.
I could go on about how astounding it is that we know more about the moon than the bottom of the ocean, or discuss the seemingly intolerable physical conditions found at those depths, most prominently the extremely high pressure. However, what I immediately thought when reading the first few articles about this expedition was: where are the scientists?

After combing through many news stories, several National Geographic sites including the site for the expedition, and a few press releases, I discovered (to my relief) that there are plenty of scientists involved. The team that’s working with Cameron includes scientists from Scripps Institution of Oceanography (the primary scientific partner and long-time collaborator with Cameron), Jet Propulsion Lab, University of Hawaii, and University of Guam.
While I firmly believe that the success of this expedition will be a HUGE accomplishment for science in the United States, I wonder if we are sending the wrong message to aspiring scientists and youngsters in general. We are celebrating the celebrity film director involved in the project in lieu of the huge team of well-educated, interesting, and devoted scientists who are also responsible for this spectacular feat (I found less than 5 names of scientists in my internet hunt). Certainly Cameron deserves the bulk of the credit for enabling this descent, but I would like there to be a bit more emphasis on the scientists as well.
Better yet, how about emphasis on the science in general? It’s too early for them to release any footage from the journey down, however I’m interested in how the samples will be/were collected, how they will be stored, what analyses will be done, whether there are experiments planned, and how the resulting scientific advances will be made just as public as Cameron’s trip was. The expedition site has plenty of information about the biology and geology of the trench, but it’s just background: there appears to be nothing about scientific methods or plans to ensure that this project will yield the maximum scientific advancement.
How does all of this relate to data and DCXL? I suppose this post falls in the category of data is important. The general public and many scientists hear the word “data” and glaze over. Data isn’t inherently interesting as a concept (except to a sick few of us). It needs just as much bolstering from big names and fancy websites as the deep sea does. After all, isn’t data exactly what this entire trip is about? Collecting data on the most remote corners of our planet? Making sure we document what we find so others can learn from it?
Here’s a roundup of some great reads about the Challenger expedition:
- National Geographic: James Cameron Begins Descent to Ocean’s Deepest Point
- National Geographic: Cameron’s dive cut short
- National Geographic press release about Cameron’s trip to the bottom
- National Geographic website for the project: Deepsea Challenge
- The Guardian: James Cameron may kill the Kraken but not our journey of discovery
- Spectacular post on Deep Sea News by Craig McClain about the value of this expedition for science and humanity
- Scripps Institution of Oceanography information page about the Deep Sea Challenge
- Stars and Stripes: Deep Sea Dive is Nothing New for the Navy
- US Navy’s Press release for 1960 Trieste trip to the trench
Oceanographers: Why So Shy?
Last week I attended the TOS/ASLO/AGU Ocean Sciences 2012 Meeting in Salt Lake City. (If you are a DCXL blog regular, you know I was also at the Personal Digital Archiving 2012 Conference last week: my ears were bleeding by Friday night!). These two conferences were starkly different in many ways. Ocean Sciences had about 4,000 attendees, while PDA was closer to 100. Ocean Sciences had concurrent sessions, plenaries, and workshops, while PDA had only one room where all of the speakers presented. Although both provided provisions during breaks, PDA’s coffee and treats far surpassed those provided at the Salt Palace. But the most interesting difference? The incorporation of social media into the conference.
There are some amazing blogs out there for ocean scientists: Deep Sea News and SeaMonster come to mind immediately. There are also a plethora of active tweeters and bloggers in the ocean sciences community, including @labroides @jebyrnes (and his blog) @MiriamGoldste @RockyRohde @JohnFBruno @kzelnio @SFriedScientist @rejectedbanana @DrCraigMc @rmacpherson @Dr_Bik. I’m sure I’ve left some great ones out; feel free to tweet me (@carlystrasser) and let me know!
That being said, ocean scientists stink at social media if OS 2012 was any indication.
First, the Ocean Sciences Meeting did not declare a hash tag – this is the first major conference I’ve been to in a while that didn’t do so. What does this mean? Those of us who were trying to communicate about OS 2012 via Twitter were not able to converge under a single hash tag until Tuesday (#oceans2012). Perhaps that isn’t such a big deal since there were only a dozen Tweeters at the conference. This is unusual for a conference of this size: at AGU 2011 in December, I would hazard a guess that there were more like 200 Tweeters. Food for thought.
Second, I heard from @MiriamGoldste that there was actual, audible clapping when disparaging comments were made about social media in one of the presentations. For shame, oceanographers! You should take advantage of tools offered to you; short of using social media yourself, you should recognize its growing importance in science (read some of the linked articles below).
Now for PDA 2012. A hash tag was declared (#pda12) and about 2 dozen active tweeters were off and running. We had dialogues during the conference, helped answer each others’ questions, commented on speakers’ major conclusions, and generally kept those that couldn’t attend the conference in person abreast of the goings-on. Combine that with real-time blogging of the meeting, and you had a recipe for being connected whether you were sitting in a pew at the Internet Archive or not. Links were tweeted to newly-posted slides, and generally there was a buzz about the conference.
So listen up, OS 2012 attendees: You are being left in the dust by other scientists who have embraced social media. I know what you are thinking: “I don’t have time to do all of that stuff!” One of the conference tweets says it best:
More information…
Read this great post from Scientific American on Social Media for Scientists
COMPASS: Communication partnership for science and the sea. I attended a COMPASS workshop two years ago at NCEAS and was swayed by the lovely Liz Neeley that social media was not only worth my time, but it could advance my career (read “Highly tweeted articles were 11x more likely to be cited” from The Atlantic).
Generally all of the resources on the Social Media For Scientists wikispace
Social Media for Scientists Recap from American Fisheries Society blog
As for how social media relates to the DCXL project, isn’t it obvious? I’ve been collecting feedback straight from potential DCXL users using social media. Because I have tapped into these networks, the DCXL project’s outcomes are likely to be useful for a large contingent of our target audience.

Academic Libraries: Under-Used & Under-Appreciated
I’m guilty. I often admit this when I meet librarians at conferences and workshops – I’m guilty of never using my librarians as a resource in my 13 years of higher ed, spread across seven academic institutions. At the very impressive MBL-WHOI Library in Woods Hole, MA, there are quite a few friendly librarians who make their presence known to visitors. They certainly offered to help me, but it never occurred to me that they might be useful beyond telling me on what floor I can find the journal Limnology and Oceanography.
In hindsight, I didn’t know any better. Yes, we took the requisite library tour in grad school, and yes, I certainly used the libraries for research and access to books and journals, but no, I never talked to the librarians. Why is this? I have a few theories:
Librarians are terrible at self promotion. Every time I meet a librarian, I’m awed and amazed by the vast quantities of knowledge they hold about all kinds of information. But most of the librarians I’ve encountered are unwilling to own up to their vast skill set. These humble folks assume scientists will come to them, completely underestimating the average academic’s stubbornness and propensity for self-sufficiency. In my opinion, librarians should stake out the popular coffee spot on campus and wear sandwich boards saying things like “You have no idea how to do research” or “Five minutes with me can change your <research> life“. Come on, librarians – toot your own horns!
Academics are trained to be self-sufficient. Every grad student has probably gotten the talk from their advisor at some point in their grad education. In my case the talk had phrases like these:
- “You don’t have to ask me EVERY time you want to run down to the supply room”
- “Which method do YOU think would work best?”
- “How should I know how to dilute that acid? Go figure it out!”
It only takes a couple of brush-offs from your advisor before you realize that part of learning to be a scientist involves solving problems all by yourself. This bodes well for future academic success, but does not allow us to entertain the idea that librarians might be helpful and save us oodles of time.
Google gives academics a false sense of security. Yes, I spend a lot of time Googling things. Much of this Googling occurs while having a drink with friends – some hotly debated item of trivia comes up, which requires that we pull out our smart phones to find out who’s right (it’s usually me). But Google can’t answer everything. Yes, it’s wonderful for figuring out who that actor in that movie was, or for showing a latecomer the amazing honey badger video. But Google is not necessarily the most efficient way to go about scholarly research. Librarians know this – they have entire schools dedicated to figuring out how to deal with information. The field of information science, which encompasses librarians, gives out graduate degrees in information. Do you really think that you know more about research than someone with a grad degree in information?? Extremely unlikely. Learn more about Information Science here.

This post does, in fact, relate to the DCXL project. If you weren’t aware, the DCXL project is based out of California Digital Library. It turns out that librarians are quite good at being stewards of scholarly communication; who better to help us navigate the tricky world of digital data curation than librarians?
This post was inspired by a great blog post yesterday from CogSci Librarian: How Librarians Can Help in Real Life, at #Sci013, and more
