Two Altmetrics Workshops in San Francisco

Last week, a group of forward-thinking individuals interested in measuring scholarly impact gathered at Fort Mason in San Francisco to talk about altmetrics. The Alfred P. Sloan Foundation funded the events, which included (1) an altmetrics-focused workshop run by the open-access publisher (and leader in ALM) PLOS, and (2) a NISO Alternative Assessment Initiative Project Workshop to discuss standards and best practices for altmetrics.

In lieu of a blog post for Data Pub, I wrote up something for the folks over at the London School of Economics Impact of Social Sciences Blog. Here’s a snippet that explains altmetrics:

Altmetrics focuses on broadening the things we are measuring, as well as how we measure them. For instance, article-level metrics (ALMs) report on aspects of the article itself, rather than the journal in which it can be found. ALM reports might include the number of article views, the number of downloads, and the number of references to the article in social media such as Twitter. In addition to measuring the impact of articles in new ways, the altmetrics movement is also striving to expand what scholarly outputs are assessed – rather than focusing on journal articles, we could also be giving credit for other scholarly outputs such as datasets, software, and blog posts.

So head on over and read up on the role of higher education institutions in altmetrics: “Universities can improve academic services through wider recognition of altmetrics and alt-products.”

Data Citation Developments

Citation is a defining feature of scholarly publication, and if we want to say that a dataset has been published, we have to be able to cite it. The purpose of traditional paper citations – to recognize the work of others and to allow readers to judge the basis of the author’s assertions – aligns with the purpose of data citations. Check out previous posts on the topic here.

In the past, datasets and databases have usually been mentioned haphazardly, if at all, in the body of a paper and left out of the list of references, but this no longer has to be the case.

Last month, there was quite a bit of activity on the data citation front:

  1. Importance: Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
  2. Credit and Attribution: Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
  3. Evidence: Where a specific claim rests upon data, the corresponding data citation should be provided.
  4. Unique Identifiers: A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community (see the sketch after this list).
  5. Access: Data citations should facilitate access to the data themselves and to such associated metadata, documentation, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
  6. Persistence: Metadata describing the data, and unique identifiers should persist, even beyond the lifespan of the data they describe.
  7. Versioning and Granularity: Data citations should facilitate identification and access to different versions and/or subsets of data. Citations should include sufficient detail to verifiably link the citing work to the portion and version of data cited.
  8. Interoperability and Flexibility: Data citation methods should be sufficiently flexible to accommodate the variant practices among communities but should not differ so much that they compromise interoperability of data citation practices across communities.
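
Principle 4 calls for identifiers that are machine actionable as well as persistent and globally unique. As a rough illustration of what “machine actionable” can mean in practice – a minimal sketch, not part of the principles themselves – a DOI can be resolved programmatically and asked for citation metadata via DOI content negotiation. The DOI below is a hypothetical placeholder.

```python
# Minimal sketch: resolve a dataset DOI and request machine-readable citation
# metadata (CSL JSON) via content negotiation at doi.org.
# The DOI is a hypothetical placeholder (10.5072 is a test prefix).
import requests

doi = "10.5072/example-dataset"
resp = requests.get(
    f"https://doi.org/{doi}",
    headers={"Accept": "application/vnd.citationstyles.csl+json"},
)
resp.raise_for_status()
metadata = resp.json()

print(metadata.get("title"))
print(metadata.get("publisher"))
```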

In the simplest case – when a researcher wants to cite the entirety of a static dataset – there seems to be a consensus set of core elements among DataCite, CODATA, and others. There is less agreement with respect to more complicated cases, so let’s tackle the easy stuff first.
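
To make the simplest case concrete, here is a minimal sketch that assembles a citation string from one plausible set of core elements (creator, publication year, title, publisher or repository, and persistent identifier). The exact element set and ordering vary by community and citation style; the dataset and DOI below are hypothetical.

```python
# Minimal sketch: assemble a data citation from a plausible set of core
# elements. The element set (creator, year, title, publisher, identifier)
# is an assumption for illustration; the dataset and DOI are hypothetical.
dataset = {
    "creators": ["Kim, A.", "Lopez, B."],
    "publication_year": 2013,
    "title": "Example ocean temperature observations",
    "publisher": "Example Data Repository",
    "identifier": "doi:10.5072/example-dataset",
}

citation = "{creators} ({year}). {title}. {publisher}. {identifier}".format(
    creators="; ".join(dataset["creators"]),
    year=dataset["publication_year"],
    title=dataset["title"],
    publisher=dataset["publisher"],
    identifier=dataset["identifier"],
)
print(citation)
# Kim, A.; Lopez, B. (2013). Example ocean temperature observations.
# Example Data Repository. doi:10.5072/example-dataset
```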

(Nearly) Universal Core Elements

Common Additional Elements

Complications

Datasets are different from journal articles in ways that can make them more difficult to cite. The first issue is deep citation or granularity, and the second is dynamic data.

Deep Citation

Traditional journal articles are cited as a whole, and it is left to the reader to sort through the article to find the relevant information. When citing a dataset, more precision is sometimes necessary: if an analysis is done on part of a dataset, it can only be repeated by extracting exactly that subset of the data. Consequently, there is a desire for mechanisms allowing precise citation of data subsets. A number of solutions have been put forward.
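
Whatever mechanism is adopted, the underlying requirement is the same: the citation must pair the dataset’s identifier with an unambiguous description of the subset, so that the exact slice can be re-extracted. A minimal sketch of that idea, assuming a hypothetical tabular dataset and a simple row/column selection (an illustration only, not any particular proposal):

```python
# Minimal sketch: pair a dataset identifier with an explicit description of
# the subset used in an analysis, so the exact slice can be re-extracted.
# The dataset, DOI, version, and selection criteria are hypothetical.
subset_citation = {
    "dataset_identifier": "doi:10.5072/example-dataset",
    "version": "2",  # which release of the dataset was used
    "selection": {
        "rows": "station == 'SF-01' and year >= 2010",  # filter applied
        "columns": ["date", "temperature_c"],           # fields retained
    },
}

def describe(citation):
    """Render the subset citation as a human-readable string."""
    sel = citation["selection"]
    return (f"{citation['dataset_identifier']} (version {citation['version']}), "
            f"rows where {sel['rows']}, columns {', '.join(sel['columns'])}")

print(describe(subset_citation))
```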

Dynamic Data

When a journal article is published, it’s set in stone. Corrections and retractions are rare occurrences, and small errors like typos are allowed to stand. In contrast, some datasets can be expected to change over time. There is no consensus as to whether or how much change is permitted before an object must be issued a new identifier. DataCite recommends, but does not require, that DOIs point to a static object.
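
One way to make “how much change” operational – offered purely as an illustration of the problem, not a recommended policy – is to fingerprint the data and treat any change in the fingerprint as a new version that needs its own identifier, or at least an explicit version number and access date in the citation. A minimal sketch, assuming the dataset is available as a local file (the file name and stored hash are hypothetical):

```python
# Minimal sketch: fingerprint a dataset file so that changes can be detected
# and recorded as a new version in the citation.
# The file name and previously recorded hash are hypothetical.
import hashlib
from datetime import date

def fingerprint(path):
    """Return the SHA-256 hash of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

current = fingerprint("ocean_temps.csv")  # hypothetical data file
recorded = "previously-recorded-sha256-hash"  # hypothetical value stored with the original citation

if current != recorded:
    # The data have changed since they were cited: record a new version
    # (or mint a new identifier) and note the access date in the citation.
    print(f"Data changed; cite the new version, accessed {date.today().isoformat()}")
```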

Broadly, dynamic datasets can be split into two categories: