The Significance of Managing Research Data

Some of the most influential research tools of the last century were created to ensure the quality of beer and to extrapolate the results of agricultural experiments conducted in the English countryside. Though ostensibly about the placement of a decimal point, an ongoing debate about the application of these tools also provides a window into what it actually means to manage research data.

The p-value: A very quick introduction

Though now ubiquitous in experiment-based research, statistical techniques for extending inferences from small samples (e.g. the participants in a research study) to larger populations are actually a relatively recent invention. The t-test, an early and still widely used example of “small sample” statistics, was developed by William Sealy Gosset in the early 20th century as an economical way of ensuring the quality of stout. Several years later, while assisting with long-term experiments on wheat and grass at Rothamsted Experimental Station, Ronald Fisher would build on the work of Gosset and others to develop a statistical framework based around the idea of comparing observations to the null hypothesis: the position that there is no significant difference between two or more specified sets of observations.

In Fisher’s significance testing framework, devices like t-tests are tests of the null hypothesis. The results of these tests indicate the probability of observing a result at least as extreme as the one in hand if the null hypothesis were true. The logic is a little tricky, but the core idea is that these tests give researchers a way of gauging the likelihood that their data is the result of sampling or experimental error. In quantitative terms, this probability is known as a p-value. In his highly influential 1925 book, Statistical Methods for Research Workers, Fisher would introduce an informal threshold for rejecting the null hypothesis: p < 0.05.
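To make that logic concrete, here is a minimal sketch in Python using scipy.stats.ttest_ind (a real SciPy function); the yield numbers are invented purely for illustration:

```python
# A minimal illustration of Fisher-style significance testing.
# The plot yields below are invented for demonstration purposes.
from scipy import stats

# Hypothetical wheat yields under two conditions
treated = [29.1, 30.4, 31.2, 28.8, 30.9, 31.5]
control = [27.2, 28.0, 26.9, 28.4, 27.7, 29.5]

# Null hypothesis: both samples come from populations with the same mean.
t_stat, p_value = stats.ttest_ind(treated, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# Fisher's informal (and much-debated) threshold
if p_value < 0.05:
    print("Reject the null hypothesis at the 0.05 level.")
else:
    print("Fail to reject the null hypothesis at the 0.05 level.")
```

Note that a small p-value here does not prove the treatment works; it only says that data this extreme would be unlikely if the null hypothesis were true.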

Image: In one of the most influential sentences in modern research methodology, Ronald Fisher describes p = 0.05 as a convenient point for judging the significance of a statistical test. From: Fisher, R.A. (1925). Statistical Methods for Research Workers.

Fisher’s work would later be synthesized with that of statisticians Jerzy Neyman and Egon Pearson, despite the vehement objections of all three, into a suite of tools that are still widely used in many fields of research. In practice, p < 0.05 has since become a one-size-fits-all indicator of success. For decades it has been acknowledged that work that meets this criterion is generally more likely to be reported in the scholarly literature, while work that doesn’t is generally relegated to the proverbial file drawer.

Beyond p < 0.05

The p < 0.05 threshold has become a flashpoint in the ongoing conversation about research practices, reproducibility, and replicability. Heated conversations about the use and misuse of p-values have been ongoing for decades, but over the summer a group of 72 influential researchers proposed a seemingly simple step forward: change the threshold from 0.05 to 0.005. According to the authors, “Reducing the p-value threshold for claims of new discoveries to 0.005 is an actionable step that will immediately improve reproducibility.”
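To see in a toy way what the proposal is after, the sketch below (my own illustration, not taken from the paper) simulates thousands of experiments in which the null hypothesis is true by construction, so every “significant” result is a false positive, and counts how often each threshold is crossed:

```python
# Toy simulation: false positive rates under two significance thresholds.
# All samples are drawn from the same distribution, so the null hypothesis
# is true and any "significant" result is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 10_000
hits_05, hits_005 = 0, 0

for _ in range(n_experiments):
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    p = stats.ttest_ind(a, b).pvalue
    hits_05 += p < 0.05
    hits_005 += p < 0.005

print(f"False positives at p < 0.05:  {hits_05 / n_experiments:.3f}")   # ~0.050
print(f"False positives at p < 0.005: {hits_005 / n_experiments:.3f}")  # ~0.005
```

The stricter threshold cuts false positives roughly tenfold, which is the intuition behind the proposal; the responses discussed below argue that the harder question is whether any single threshold should govern all research.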

As of this writing, two responses have been published. Both weigh the pros and cons of p < 0.005 and argue that the placement of a decimal point is less of a problem than the uncritical use of a single one-size-fits-all threshold across many different circumstances and fields of research. Both end on calls for greater transparency and stronger justifications for how decisions related to research design and statistical practice are made. If the initial paper proposed changing the answer from p < 0.05 to 0.005, both responses highlight the necessity of changing the question from one that is focused on statistics to one that incorporates research data management (RDM).

Ensuring that data can be used and evaluated in the future is one of the primary goals of RDM. For example, the RDM guide we’re developing does not have a space for assessing p-values. Instead, its focus is assessing and advancing practices related to planning for, saving, and documenting data and other research products. Such practices come with their own nuance, learning curves, and jargon, but they are important elements of any effort to ensure that research decisions are transparent and justified.

Resources and Additional Reading

Benjamin, D. J., Berger, J. O., Johannesson, M., Nosek, B. A., Wagenmakers, E. J., Berk, R., … & Cesarini, D. (2017). Redefine statistical significance. Nature Human Behaviour. doi: 10.1038/s41562-017-0189-z

Lakens, D., Adolfi, F. G., Albers, C. J., Anvari, F., Apps, M. A. J., Argamon, S. E., … Zwaan, R. A. (2017). Justify your alpha: A response to “Redefine statistical significance”. PsyArXiv preprint. doi: 10.17605/OSF.IO/9S3Y6

McShane, B. B., Gal, D., Gelman, A., Robert, C., & Tackett, J. L. (2017). Abandon statistical significance. arXiv preprint. arXiv: 1709.07588.

Sterling, T. D. (1959). Publication decisions and their possible effects on inferences drawn from tests of significance—or vice versa. Journal of the American Statistical Association, 54(285), 30-34. doi: 10.1080/01621459.1959.10501497

Rosenthal, R. (1979). The file drawer problem and tolerance for null results. Psychological Bulletin, 86(3), 638-641. doi: 10.1037/0033-2909.86.3.638

Data Publishing–the First 500 Years

Data publishing is the new hot topic in a growing number of academic communities. The scholarly ecosphere is filled with listserv threads, colloquia, and conference hallway chats punctuated with questions of why to do it, how to do it, where to do it, when to do it, and even what to call this seemingly new breed of scholarly output. Scholars, and those who provide the tools and infrastructure to support them, are consumed with questions that don’t seem to have easy answers, and certainly not answers that span all disciplines. How can researchers gain credit for the data they produce, separate from and in addition to the analysis of those data as articulated in formal publications? How can scholars researching an area find relevant data sets within and across their own disciplines? How are data and methodologies most effectively reviewed, validated, and corrected? How are meaningful connections maintained between different versions, iterations, and “re-uses” of a given set of data?

The high pitch of debate on these topics is surprising in some ways, given that datasets, at least in certain fields, have been readily available for a while. The Inter-University Consortium for Political and Social Research (ICPSR) has been allowing social scientists to publish or find datasets since the early 1960s. Great datasets gathered and published under institutional auspices include UN Data, the UNESCO Statistical Yearbook, and the IMF Depository library program. Closer to home is the United States’ Federal Depository Library Program, which since its establishment in 1841 has served as a distribution mechanism to ensure public access to governmental documents and data.

While these outlets are only viable solutions for some disciplines, their presence started me down a path exploring the history of data publishing, in an effort to gain some perspective on the challenges we are facing today. Somewhat surprisingly, data publishing, conducted in a manner that would be recognized by today’s scholars, has been occurring for almost half a millennium. Yes, that’s right: we are now 500 years into producing, analyzing, and publishing data.

These early activities centered on demographic data, presumably in an effort to identify and understand the dramatic patterns of life and death. Starting in the late 1500s and prompted by the Plague, “Bills of Mortality” recording deaths within London began to be published, soon appearing on a weekly basis. That raw data generation caught the attention of a community-minded draper: the extremely bright, but non-university-affiliated, London resident John Graunt. Graunt was inspired to gather those numerical lists, turn them into a dataset, analyze those data (looking for causes of death, ages at death, comparisons of rates between London and elsewhere, etc.), and publish both the dataset and his findings regarding population patterns in a groundbreaking 1662 work, “Natural and Political Observations Mentioned in a Following Index, and Made upon the Bills of Mortality.” The work was submitted to the Royal Society, which recognized its merit and inducted the author into its fellowship. Graunt continued to extend his data and analysis, publishing new versions of each in subsequent years. Thus was born the first great work (at least in the Western world) of statistical analysis, or “political arithmetic” as it came to be called at the time.

Moving from the 16th and 17th centuries to the 18th brings us to another major point in data publishing history: Johann Sussmilch of Germany. Sussmilch was originally a cleric involved in a variety of intellectual pursuits, though unaffiliated with a university, at least initially. His interests included theology, statistics, and linguistics, and he was eventually appointed to the Royal Academy of Sciences and Fine Arts for his linguistic scholarship. Sussmilch’s great work was the “Divine Order,” an ambitious effort to collect detailed data about the population of Prussia in order to prove his religious theory of “Rational Theology.” In other words, Sussmilch was engaged in a basic research program: he had a theory, formed a research question, collected the data required to test that theory, analyzed his data, and then published his results along with his data.

The rigorous quality of Sussmilch’s work (both the data and the analysis) elevated it far beyond his original and personal religious motivations, leading it to have a wide impact throughout parts of Europe. It became a focal point of exchange between scholars across countries and prompted debate over his data collection methodology and interpretation. Put another way, Sussmilch’s work drew his colleagues into the modern model of “scholarly communication”: a spirited critical dialogue that in turn resulted in changes to the next edition of the work (for instance, separate tables for immigration and racial data). Published first in 1741, it was updated and reprinted six times through 1798.

In this earlier time, as in our own, the drive to engage with other intellectuals was paramount. Publishing, sharing, critiquing, and modifying data production efforts and analysis was seemingly as much a driving force among this community as it is among the scholars of today. Researchers of the 17th and 18th centuries dealt with issues of attribution, review of data veracity and analytical methodology, and even versioning. The surprising discovery of apparent similarities across such a large gulf of time prompts many questions. If data could be published and shared centuries ago, why are we faced with such tremendous challenges in doing the same today? Are we overlooking approaches from the past that could help us today? Or are we glossing over the difficulties of the past?

More research would have to be done to answer these questions thoroughly, but perhaps a gesture can be made in that regard by identifying some of the contrasts between yesterday and today. Taking the examples above as a jumping-off point, perhaps the most striking difference between past activities and the goals articulated in conversations about data publishing today is that the data publication efforts of the past were accompanied by an equally important piece of analysis, and the research community was interested in both. The conclusions drawn from the data were held up to scrutiny, as were the data and data collection methods that provided their foundation. All of the components of the research were of concern. These scholars were not interested in publishing the data on their own; rather, they wanted to present the data along with their arguments, with each underscoring the other.

Another difference is the changing relationships between individual researchers and the entities that support them. Not only do we have governments and academic institutions, but we have a new contemporary player, the corporation, which is driven by a substantially different motivation than the entities of past ages. In addition, a broader range of disciplines is now concerned with data publication, and perhaps those disciplines face stumbling blocks not at issue for the social scientists working with demographic and public health data. Given the known heterogeneity of scholarly communication practices across different fields, there seems to be no reason to think that data publishing needs, expectations, and concerns would not also vary. And of course, the most obvious difference between then and now lies in tools and technology. Have those advancements altered fundamental data publishing practices, and if so, how?

These are interesting but complex questions to pursue. Fortunately, what the above examples of our data publishing antecedents have hopefully revealed is that there are meaningful touchstones to use as reference points as we attempt to address them. Data publishing has a rich, resonant past stretching back hundreds of years, providing us with an opportunity to reach into that past to better understand the trajectory that has brought us to this moment, thereby helping us more effectively grapple with the questions that seem to confound us today.

The Science of the DeepSea Challenge

Recently the film director and National Geographic explorer-in-residence James Cameron descended to the deepest spot on Earth: the Challenger Deep in the Mariana Trench. He partnered with many sponsors, including National Geographic and Rolex, to make this amazing trip happen. A lot of folks outside of the scientific community might not realize this, but until this week there had been only one successful descent to the trench by a human-occupied vehicle (that’s a submarine for you non-oceanographers). You can read more about that 1960 exploration here and here.

I could go on about how astounding it is that we know more about the moon than the bottom of the ocean, or discuss the seemingly intolerable physical conditions found at those depths, most prominently the extremely high pressure. However, what I immediately thought when reading the first few articles about this expedition was: where are the scientists?

Before Cameron, Swiss oceanographer Jacques Piccard and Navy officer Don Walsh descended in the bathyscaphe Trieste to the virgin waters of the deep. From: www.history.navy.mil/photos/sh-usn/usnsh-t/trste-b

After combing through many news stories, several National Geographic sites (including the site for the expedition), and a few press releases, I discovered (to my relief) that there are plenty of scientists involved. The team working with Cameron includes scientists from Scripps Institution of Oceanography (the primary scientific partner and long-time collaborator with Cameron), the Jet Propulsion Lab, the University of Hawaii, and the University of Guam.

While I firmly believe that the success of this expedition will be a HUGE accomplishment for science in the United States, I wonder if we are sending the wrong message to aspiring scientists and youngsters in general. We are celebrating the celebrity film director involved in the project rather than the huge team of well-educated, interesting, and devoted scientists who are also responsible for this spectacular feat (I found fewer than five names of scientists in my internet hunt). Certainly Cameron deserves the bulk of the credit for enabling this descent, but I would like there to be a bit more emphasis on the scientists as well.

Better yet, how about emphasis on the science in general? It’s too early for them to release any footage from the journey down; however, I’m interested in how the samples will be (or were) collected, how they will be stored, what analyses will be done, whether there are experiments planned, and how the resulting scientific advances will be made just as public as Cameron’s trip was. The expedition site has plenty of information about the biology and geology of the trench, but it’s just background: there appears to be nothing about scientific methods or plans to ensure that this project will yield the maximum scientific advancement.

How does all of this relate to data and DCXL? I suppose this post falls in the category of “data is important.” The general public and many scientists hear the word “data” and glaze over. Data isn’t inherently interesting as a concept (except to a sick few of us). It needs just as much bolstering from big names and fancy websites as the deep sea does. After all, isn’t data exactly what this entire trip is about? Collecting data on the most remote corners of our planet? Making sure we document what we find so others can learn from it?

Here’s a roundup of some great reads about the Challenger expedition: