Themes for Digital Preservation

By: Eric Lopatin

Since our inception, a main focus of UC3 has been to deliver high-quality, reliable digital preservation services for the UC community. Currently, this takes the form of consultative and community engagement work, as well as technical development to ensure our digital preservation repository, Merritt, remains durable and innovative.

For 2021, our digital preservation goals fit into three key themes: Community Engagement, Simplification, and Scaling. By working with these themes in mind, we aim to promote the values at the core of many preservation systems and programs – values such as reliability, authenticity, integrity and sustainability. These values play out at the crossroads of technology and policy, and we’re aiming to keep abreast of developments in both while heading into 2021.

Community Engagement 

Metrics and insight – Last year, our team laid the groundwork for more granular reporting on content held in Merritt. This work has already allowed us to provide campuses with reports that illustrate a variety of aspects of their collections. This year, we’ll work on dashboards and data visualizations to give users more insight into their collections.

UC system-wide digital preservation – Over the past two years, UC3 has participated in multiple phases of a system-wide Digital Preservation Strategy working group. The next phase of this effort will establish a system-wide leadership group and begin to construct a digital preservation training program across UC campuses. In 2021, our team will continue to participate in these efforts as they set the course for future projects across our campus community.

NDSA engagement – The National Digital Stewardship Alliance is a long-standing, community-building organization that promotes discussion, learning and standards surrounding digital preservation. This year I’ll be co-chairing the NDSA Infrastructure Interest Group with Leah Prescott at Georgetown University. We’ll be facilitating conversations surrounding preservation technologies and infrastructure, while also joining NDSA Leadership meetings to help apply input from interest group participants directly to the activities the organization takes on throughout the year. Given the record attendance at NDSA’s recent DigiPres 2020 conference, I’m looking forward to helping build out future opportunities through which the larger preservation community can collaborate.

Simplification

Preservation Assurance – The overarching UC3 digital preservation strategy calls for creating and maintaining three copies of every object in Merritt, across two geographic regions with differing disaster threats, with at least one of those copies held in less volatile nearline storage. All Merritt collection content now adheres to this strategy, and we replicate new submissions to our cloud storage providers as they arrive.
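The three-copy rule described above can be expressed as a small check. What follows is only an illustrative sketch, not Merritt's actual implementation: the `StoredCopy` type, the region labels and the `"nearline"` storage class name are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StoredCopy:
    """One replica of an object (hypothetical model, not Merritt's schema)."""
    region: str         # assumed label for a geographic region
    storage_class: str  # assumed values: "online" or "nearline"


def meets_preservation_policy(copies: Iterable[StoredCopy]) -> bool:
    """Return True if the copies satisfy the rule stated in the post:
    at least three copies, spread over at least two regions,
    with at least one copy in nearline storage."""
    copies = list(copies)
    regions = {c.region for c in copies}
    has_nearline = any(c.storage_class == "nearline" for c in copies)
    return len(copies) >= 3 and len(regions) >= 2 and has_nearline


# Example: two online copies in one region plus a nearline copy elsewhere.
compliant = [
    StoredCopy("region-a", "online"),
    StoredCopy("region-a", "online"),
    StoredCopy("region-b", "nearline"),
]
```

A check like this could run after each replication pass to flag objects that have fallen out of compliance, for instance when all three copies end up in a single region.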

New submissions – One of the most commonly used methods of adding new content to Merritt is uploading a manifest file that enables batch ingest of hundreds or thousands of objects that are pulled from a user’s on-prem storage. In 2021, we’re planning to simplify and automate the manifest creation process to assist users with this task, so they can be assured that all of their objects and object-level metadata will be handled correctly.

Common API – One API for use across the Merritt system has been a goal for quite a while, and we’re looking forward to making it a reality. In 2021, we will continue our work to design a common API for use by users and Merritt microservices alike. This will allow for submitting content, gaining insight into existing submissions, integrating external systems more easily, and of course accessing individual files and object versions.

Scaling

Auto-scaling – The theme of Simplification goes hand-in-hand with Scaling. In our case, effectively scaling aspects of the Merritt system could be more aptly referred to as auto-scaling. In a recent blog post, we discussed how the team has been at work implementing a centralized parameter store with AWS Systems Manager to streamline Merritt microservice configuration. 

Resilience – In 2021, our work will include simplifying the process of adding new hosts when needed (during periods of increased load on the system). Eventually, our goal is for this to happen without human intervention. On the flip side, hosts will also be spun down when they are not needed. Auto-scaling microservices in this sense promises to make the overall Merritt system more resilient, secure and cost-effective.
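To make the idea of adding and removing hosts without human intervention a bit more concrete, here is a minimal sketch of a threshold-based scaling decision. It is not how Merritt scales today: the function name, the load thresholds and the host limits are all hypothetical values chosen for illustration.

```python
def desired_host_count(
    current_hosts: int,
    average_load: float,
    scale_up_above: float = 0.75,   # assumed threshold, fraction of capacity
    scale_down_below: float = 0.25,  # assumed threshold, fraction of capacity
    min_hosts: int = 1,              # assumed floor
    max_hosts: int = 8,              # assumed ceiling
) -> int:
    """Return how many hosts should be running for the given average load.

    Adds one host under heavy load, removes one when load is light,
    and otherwise leaves the count unchanged. The result is always
    clamped to the range [min_hosts, max_hosts].
    """
    if average_load > scale_up_above:
        target = current_hosts + 1
    elif average_load < scale_down_below:
        target = current_hosts - 1
    else:
        target = current_hosts
    return max(min_hosts, min(max_hosts, target))
```

Changing the count by one host at a time, with a gap between the two thresholds, is a simple way to avoid rapidly adding and removing hosts when load hovers near a single cutoff.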

In summary, 2021 promises to be a busy year for our digital preservation team at CDL. As always, feel free to contact me with any questions. I am happy to discuss any of these ideas and directions for 2021, along with others you may have in mind!

This blog is a part of the “A Peek Into 2021 for UC3” series.

Research Data Publishing Redefined

By: Daniella Lowenberg 

In considering the title for this post, I struggled to narrow down the range of activities that I work on to a specific name and have landed on research data publishing (which is timely now four years after my first blog for UC3 defining data publishing). The reason this is difficult is that while data publishing may be a relatively new topic in the last decade, it is often tied specifically to depositing data in a repository. But the activities that I believe define data publishing are: 

In short: it’s not one component (e.g., managing a repository). Specializing in data publishing means focusing on and leading the development of communities of practice around open standards, open infrastructure, and easily understood and accessible workflows for researchers. 

So, what’s the vision for 2021? Well, adoption of course. But in the following areas:

Seamless Connections 

A big area that we’ve invested in is our partnership with Dryad. When considering adoption for Dryad, we are referring to both the researcher and research-supporting communities. Seamlessly connecting researchers and research supporters is essential to our globally shared goal of making open data a more commonly accepted and well-executed practice. A lot of this requires education (shout out to UC’s Love Data Week), but as Product Manager for Dryad I am also thinking through and prioritizing how services and workflows can be educational and as easy as possible. Our iterative product strategy is focused on collaborations and integrations that will make data publishing more seamless and better connected to other components of the research process. We are looking forward to launching our integrations with Zenodo for software, eJournalPress and Editorial Manager for journals, and Frictionless Data for increased quality of our datasets. In terms of the research-supporting communities, we are working to build better connections among the growing number of funders, publishers, and institutions that have long supported or recently joined the Dryad community.

Open Data Metrics

It’s become increasingly clear that we need a way to evaluate the reach and re-use of openly published research data. The Make Data Count initiative is continuing to build the social and technical infrastructure for open data metrics. Beginning this year, a team of bibliometricians at the University of Ottawa and ZBW is initiating qualitative and quantitative studies on data citation and re-use across various scientific disciplines that will influence the development of proper indicators. We are also beginning to map services that will be broadly accessible for repositories to standardize and aggregate data usage and citation.

Daniella’s book club recommendation —  Open Data Metrics: Lighting the Fire.

Research Data Publishing Ethics

Coming from the journal publishing side of open research, I joined the UC3 team wondering about publication ethics and how to best position our data publishing initiatives so that they are sensitive to the various issues that can arise with research publication. We are proud to spearhead and launch a new FORCE11 Working Group, the Research Data Publishing Ethics WG. This group, which informally began a year ago with folks from agencies, repositories, publishers, preprint advocates, and research integrity experts, will develop community norms and proposed workflows for data publishers to consider (e.g., publishing identifying information, considering legal standards across countries, authorship disputes). Please join if you have interest in contributing to the development of these standards or would like to follow the conversations!

While these are a few highlighted areas that we will be focused on in the new year, we are always interested in collaborating or generating new ideas around research data publishing and how to best support researchers in the advancement of their discoveries. If we have learned anything in COVID times, we know that this space is essential. 

This blog is a part of the “A Peek Into 2021 for UC3” series.

A Peek Into 2021 for UC3

By: The UC3 Team

Across the UC3 team, we specialize in topics in digital curation, digital preservation, and open research. We manage a range of services, leading and participating in initiatives to move these topic areas forward.  

We’ve found that while some collaborators and colleagues may already be familiar with what we do, the full portfolio of our projects and communities may not always be well understood. Reflecting on the breadth of our current work as well as where we are headed, the UC3 team is kicking off this year with a series of posts about the areas of digital curation that we are involved in: 

UC3 has been around since 2009 and the team is ever-changing. As our work evolves, both in leading and in joining efforts in open research practices, our goal for the series is to re-familiarize folks with the areas we are invested in and share more about the thinking and strategy behind where we are headed in 2021.