UC3’s Development Culture
UC3 takes an iterative deployment approach based on Agile principles. Our development is driven by user-centered design: product (service) managers gather the needs of our stakeholders and translate them into prioritized use cases. By placing the needs of stakeholders and users at the center of our analysis and design, we develop services and collections that are impactful, useful, and used.
We implement Agile principles by way of scrum practices, with features prioritized and scheduled to be developed and deployed within time-boxed sprints. These activities are carried out by the product manager and the development team through ceremonies, which are held to iteratively plan the current sprint, define our backlog, and release “done” product increments. We also reflect on the process itself to continuously improve how we do our work. Holding ceremonies allows us to stay focused on the tasks at hand, minimize distractions, and facilitate communication across the team.
UC3 aims for regularly scheduled deployments, resulting in release processes that are routine and deliberate as opposed to crisis-driven. Basing releases on minimum viable products allows us to fail fast: make quick decisions and reroute when necessary. Releases include versioning, release notes, and updated documentation for new features and support. Integrations with other services, both within and external to CDL, are designed using best practices to ensure robust interoperability.
Our focus is on motivation, community, and trust as opposed to structure and control. We strive for aligned autonomy, giving teams the ability to solve problems on their own. Teams decide what to build and how to build it, and they work together while doing it.
We communicate via our roadmaps, blogs, and wikis to stay in sync with each other, both internally and externally. Check out our team here!
Neuroimaging as a case study in research data management: Part 2
Part 2: On practicing what we preach
A few weeks ago I described the results of a project investigating the data management practices of neuroimaging researchers. The main goal of this work is to help inform efforts to address rigor and reproducibility in both the brain imaging (neuroimaging) and academic library communities. But, as we were developing our materials, a second goal emerged: practice what we preach and actually apply the open science methods and tools we in the library community have been recommending to researchers.
Wait, what? Open science methods and tools
Before jumping into my experience of using open science tools as part of a project that involves investigating open science practices, it’s probably worth taking a step back and defining what the term actually means. It turns out this isn’t exactly easy. Even researchers working in the same field understand and apply open science in different ways. To make things simpler for ourselves when developing our materials, we used “open science” broadly to refer to the application of methods and tools that make the processes and products of the research enterprise available for examination, evaluation, use, and re-purposing by others. This definition doesn’t address the (admittedly fuzzy) distinctions between related movements such as open access, open data, open peer review, and open source, but we couldn’t exactly tackle all of that in a 75-question survey.
From programming languages used for data analysis like Python and R, to collaboration platforms like GitHub and the Open Science Framework (OSF), to writing tools like LaTeX and Zotero, to data sharing tools like Dash, figshare, and Zenodo, there are A LOT of different methods and tools that fall under the category of open science. Some of them worked for our project, some of them didn’t.
Data Analysis Tools
As both an undergraduate and graduate student, all of my research methods and statistics courses involved analyzing data with SPSS. Even putting aside the considerable (and recurrent) cost of an SPSS license, I wanted to go in a different direction in order to get some first-hand experience with the breadth of analysis tools that have been developed and popularized over the last few years.
I thought about trying my hand at a Jupyter notebook, which would have allowed us to share all of our data and analyses in one go. However, I also didn’t want to delay things while I taught myself how to work within a new analysis environment. From there, I tried a few “SPSS-like” applications, such as PSPP and Jamovi, and would recommend both to anyone who has a background like mine and isn’t quite ready to start writing code. I ultimately settled on JASP because, after taking a cursory look through our data using Excel (I know, I know), I saw it was actually being used by the participants in our sample. It turns out that’s probably because it’s really intuitive and easy to use. Now that I’m not in the middle of analyzing data, I’m going to spend some time learning other tools. But, while I do that, I’m going to keep using and recommending JASP.
From the very beginning, we planned on making our data open. Though I wasn’t necessarily thinking about it at the time, this turned out to be another good reason to try something other than SPSS. Though there are workarounds (see the sketch below), .sav is not exactly an open file format. But our plan to make the data open not only affected the choice of analysis tools, it also affected how I felt while running the various statistical tests. On one hand, knowing that other researchers would be able to dive deep into our data amplified my normal anxiety about checking and re-checking (and statcheck-ing) the analyses. On the other hand, it also greatly reduced my anxiety about inadvertently relegating an interesting finding to the proverbial file drawer.
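As a concrete illustration of the kind of workaround I mean, here is a minimal sketch that gets data out of a .sav file using the open-source pyreadstat library. The file names are placeholders, not our actual files:

```python
# Convert a proprietary SPSS .sav file to open formats: a CSV of the data
# plus a plain-text codebook. Assumes pyreadstat is installed and that
# "survey.sav" is a placeholder file name.
import pyreadstat

df, meta = pyreadstat.read_sav("survey.sav")

# Write the data themselves to CSV, an open, software-agnostic format.
df.to_csv("survey.csv", index=False)

# Preserve the variable labels that .sav files carry, so the CSV
# remains interpretable without SPSS.
with open("survey_codebook.txt", "w") as f:
    for name, label in zip(meta.column_names, meta.column_labels):
        f.write(f"{name}: {label}\n")
```

The point isn’t this particular script so much as the principle: the data can be liberated into formats anyone can open.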
Collaboration Tools
When we first started, it seemed sensible to create a repository on the Open Science Framework in order to keep our various files and tools organized. However, since our collaboration is between just two people and there really aren’t that many files and tools involved, it became easier to just use services that were already incorporated into our day-to-day work: namely e-mail, Skype, Google Drive, and Box. Though I can see how it could be useful for a project with more moving parts, for our purposes it mostly just added an unnecessary extra step.
Writing Tools
This is where I restrain myself from complaining too much about LaTeX. Personally, I find it a less-than-awesome platform for doing any kind of collaborative writing. Since we weren’t writing chunks of code, I also couldn’t find an excuse to write the paper in R Markdown. Almost all of the collaborative writing I’ve done since graduate school has been in Google Docs, and this project was no exception. While it’s not exactly the best when it comes to formatting text or integrating tables and figures, I haven’t found a better tool for working on a text with other people.
We used a Mendeley folder to share papers and keep our citations organized. Zotero has the same functionality, but I personally find Mendeley slightly easier to use. In retrospect, we could also have used something like F1000 Workspace, which integrates more directly with Google Docs.
This project is actually the first time I’ve published a preprint. Like making our data open, this was the plan all along. The formatting was done in Overleaf, mostly because it was a (relatively) user-friendly way to use LaTeX and I was worried our tables and figures would break the various MS Word bioRxiv templates that are floating around. Similar to making our data open, planning to publish a preprint had an impact on the writing process. I’ve since noticed a typo or two, but knowing that people would be reading our preprint only days after its submission made me especially anxious to check the spelling, grammar, and general flow of our paper. On the other hand, it was a relief to know that the community would be able to read the results of a project that started at the very beginning of my postdoc before its conclusion.
Data Sharing Tools
Our survey and data are both available via figshare. More specifically, we submitted our materials to KiltHub, Carnegie Mellon’s institutional instance of figshare. For those of you currently raising an eyebrow: we didn’t submit to Dash, UC3’s data publication platform, because of an agreement outlined when we were going through the IRB process. Overall, the submission was relatively straightforward, though the curation process definitely made me consider how difficult it is to balance adding proper metadata and documentation to a project with the desire (or need) to just get material out there quickly.
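For the curious, deposits can also be scripted against figshare’s REST API. The sketch below is a hypothetical example with a placeholder token and metadata, not a record of our actual submission, and institutional instances like KiltHub may layer their own required fields and curation steps on top:

```python
# A rough sketch of creating a draft item on figshare via its v2 REST API.
# The token and metadata are placeholders for illustration only.
import requests

BASE = "https://api.figshare.com/v2"
TOKEN = "YOUR_PERSONAL_TOKEN"  # generated under your figshare account settings
HEADERS = {"Authorization": f"token {TOKEN}"}

# A title is enough to create a draft; authors, categories, and other
# metadata fields can be added per the API documentation.
metadata = {
    "title": "Example survey materials (placeholder)",
    "description": "Survey instrument and anonymized responses (placeholder).",
}

resp = requests.post(f"{BASE}/account/articles", headers=HEADERS, json=metadata)
resp.raise_for_status()

# The draft stays private until explicitly published; files are attached
# afterwards via figshare's separate multi-part upload endpoints.
print("Draft created:", resp.json()["location"])
```

Even scripted, of course, the hard part remains the metadata and documentation, not the HTTP calls.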
A few more thoughts on working openly
More than once over the course of this project I joked to myself, to my collaborator, or really to anyone who would listen that “this would probably be easier or quicker if we could just do it the old way.” However, now that we’ve submitted our paper (to an open access journal, of course), it’s been useful to look back on what it has been like to use these different open science methods and tools. My main takeaways are that there are a lot of ways to work openly and that what works for one researcher may not necessarily work for another. Most of the work I’ve done as a postdoc has been about meeting researchers where they are, and this process has reinforced my desire to do so when talking about open science, even when the researcher in question is myself.
Like our study participants, who largely reported that their data management practices are motivated and limited by immediate practical concerns, a lot of our decisions about which open science methods and tools to apply were heavily influenced by the need to keep our project moving forward. As much as I may have wanted to, I couldn’t pause everything to completely change how I analyze data or write papers. We committed ourselves to working openly, but we also wanted to make sure we had something to show for ourselves.
Additional Reading
Borghi, J. A., & Van Gulick, A. E. (2018). Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers. bioRxiv.
We Are Talking Loudly and No One Is Listening
“Listening is not merely not talking, though even that is beyond most of our powers; it means taking a vigorous, human interest in what is being told to us” — Alice Duer Miller
A couple of months ago I wrote about how we need to advocate for data sharing and data management with more focus on adoption and less on technical backends. I thought this was the key to getting researchers to change their practices and make data available as part of their normal routines. But there’s more we need to change than just not arguing over platforms: we need to listen.
We are talking loudly and saying nothing.
I routinely visit campuses to lead workshops on data publishing (both train-the-trainer sessions for librarians and sessions for researchers). Regardless of the material presented, there are always two different conversations happening in the room. At each session, librarians pose technical questions about backend technologies and integrations with scholarly publishing tools (e.g., ORCID). These are great questions for a scholarly publishing conference but confusing for researchers. This is how workshops start:
Daniella: “Who knows what Open Access is?”
Fewer than 50% of the researchers in the room raised their hands.
Daniella: “Has anyone here been asked to share their data, or do you understand what this means?”
Fewer than 20% of the researchers in the room raised their hands.
Daniella: “Does anyone here know what an ORCID is or have one?”
One person in total raised their hand.
We are talking too loudly and no one is listening.
We have characterized ‘Open Data’ as a success because incentives exist and authors write data statements, but this misconception has allowed the library community to focus on scholarly communications infrastructure instead of continuing to work on the issue at hand: sharing research data is not well understood, incentivized, or accessible. We need to focus our efforts on listening to the research community about what their processes are and how data sharing could be a part of them, and then take this as guidance in advocating for our library resources to become part of lab norms.
We need to be focusing our efforts on education around HOW to organize, manage, and publish data.
Change will come when organizing data to be shared throughout the research process is the norm. Our goal should be to grow adoption of sharing and managing data and, as a result, see an increase in researchers who know how to organize and publish data. Less talk about why data should be available, and more hands-on work getting research data into repositories in accessible, researcher-desirable ways.
We need to only build tools that researchers WANT.
The library community has lots of ideas about what should be a priority right now in the data world, such as curation, data collections, and badges, but we are getting ahead of ourselves. While these initiatives may be shinier and more exciting, it feels like we are polishing marathon trophies before runners can finish a one-mile jog. And we’re not doing a good job of understanding their perspectives on running in the first place.
Before we can convince researchers that they should care about library curation and ‘FAIR’ data, we need to get researchers to even think about managing and publishing data as a normal part of their research activities. This begins with organization at the lab level and with figuring out ways to integrate data publishing systems into lab practice without disrupting normal activity. When researchers are worried about finishing their experiments, publishing, and their careers, simply naming platforms they should be using is neither effective nor helpful. It is effective to find ways to relieve publishing pain points and make the process easier. Tools and services for researchers should be understood as ways to make their research and publishing processes easier.
“When you listen, it’s amazing what you can learn. When you act on what you’ve learned, it’s amazing what you can change.” — Audrey McLaughlin
Librarians: this is a space where you can make an impact. Be the translators. Listen to what the researchers want, understand the research day-to-day, and translate to the infrastructure and policy makers what would be effective tools and incentives. If we focus less time and resources on building tools, services, and guides that will never be utilized or appreciated, we can be effective in our jobs by re-focusing on the needs of the research community as requested by the research community. Let’s act like scientists and build evidence-based conclusions and tools. The first step is to engage with the research community in a way that allows us to gather the evidence. And if we do that, maybe we could start translating to an audience that wants to learn the scholarly communication tools and language and we could each achieve our goals of making research available, usable, and stable.