Ocean Health and Data Sharing
How do we go about measuring the health of complex ecosystems, especially with humans (the ultimate complicating factor) in the mix? We use good data and smart science, that’s how. Of course, this “good data” can’t be collected by any one scientist. Instead, we should rely on years of data collection by scientists who are experts at what they do.
If you are a science news junkie like me, you might have noticed a lot of recent buzz about the Ocean Health Index. This is an incredible project with one simple goal: measure the health of the oceans. Of course, the word “simple” is a stretch at best in this case. Oceans (especially coastal oceans with nearby humans) are about as complicated as nature can get. There are:
- biological factors like fishes, algae, invertebrates, and marine mammals
- geological components like sand, mud, rocks, and cliffs
- physical factors like tides, storms, and upwelling
- chemical factors like pollutants, water-rock interactions, and freshwater runoff
Layer on top of this system the biggest impact of all: humans. We have cruise ships, sand castles, scuba divers, marinas, tourists, surfers, cliff divers, and fishermen, to name a few. How do you manage to take all of this into account to measure the health of oceans?
Basically, the OHI factors in 10 different “public goals”, including tourism, clean waters, biodiversity, food provision, and coastal economies. The OHI team has equations for calculating a number from 0 to 100 for each goal, and taken together, these numbers indicate the health of a particular region of the ocean. The equations take into account both the current status and likely future status of each goal, making the OHI robust for prediction.
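To make the idea concrete, here is a minimal Python sketch of how scores like these could be rolled up. It is an illustration only, not the OHI team's code: it assumes each goal score is the plain average of current status and likely future status, and that the overall number is a weighted average of the goal scores. The function names, the equal default weights, and the example numbers are all made up for this sketch.

```python
def goal_score(current_status, likely_future_status):
    """Score one goal on a 0-100 scale.

    Assumption for this sketch: current status and likely future
    status count equally toward the goal score.
    """
    for value in (current_status, likely_future_status):
        if not 0 <= value <= 100:
            raise ValueError("status values must be between 0 and 100")
    return (current_status + likely_future_status) / 2


def index_score(goal_scores, weights=None):
    """Combine per-goal scores into one 0-100 number for a region.

    goal_scores: dict mapping goal name -> score (0-100).
    weights: optional dict mapping goal name -> non-negative weight.
             Defaults to equal weights (an assumption of this sketch).
    """
    if not goal_scores:
        raise ValueError("need at least one goal score")
    if weights is None:
        weights = {goal: 1.0 for goal in goal_scores}
    total_weight = sum(weights[goal] for goal in goal_scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(weights[goal] * score for goal, score in goal_scores.items())
    return weighted_sum / total_weight


# Hypothetical numbers for three of the ten goals in one region.
region = {
    "tourism": goal_score(80, 60),
    "clean waters": goal_score(90, 90),
    "biodiversity": goal_score(85, 75),
}
print(index_score(region))  # 80.0
```

Passing a `weights` dict lets some goals count more than others; leaving it out treats every goal the same.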
So what goes into these equations? DATA, of course! Many (many) scientists contributed data to the Ocean Health Index. Check out the main paper on OHI and its supplemental material. Relevant datasets were compiled to provide parameters for the goal equations. The more data, the better the model (usually), and the OHI folks took that to heart – the list of contributing datasets is daunting.
I don’t have to ask to know that compiling those datasets was no easy feat. Poor documentation (i.e., metadata), bad file formats, icky table organization, and missing information likely plagued the OHI researchers pulling this information together. And here’s the DataUp connection: if only the contributors had all used a tool to create well-documented data that follows best practices for data management!

I’m really excited to see the OHI released. The website is pretty amazing, and definitely NOT geared towards the nerdy science types (although we can find the raw data pretty easily if we want it!). Go play with it, share it with your family and friends, and help raise awareness about the importance of well-documented datasets to society’s well-being. Share it with your colleagues and lab mates to emphasize that their data might be used in unimaginable ways in the future – which means good data management is critical.
More on the OHI:
- Conservation International article
- New York Times Green Blog
- News piece from National Geographic
- Scientific American article authored by Halpern (genius behind OHI)
- Nature News piece
The Geek SXSW

Seeing the acronym SXSW is likely to elicit one of two responses: (1) Huh? or (2) OMGOMGOMG!! If you are in the former camp, rest easy: I will explain. SXSW is the abbreviation for South by Southwest, a big old festival/conference down in my home state of Texas. SXSW has been going strong in Austin since 1987 and gets bigger every year (60,000 registrants this year!). Read more about it on Wikipedia, or check out SXSW’s stats pdf to be wowed and amazed at how huge this thing is. From the SXSW website:
The South by Southwest® (SXSW®) Conferences & Festivals (March 8-17, 2013) offer the unique convergence of original music, independent films, and emerging technologies. Fostering creative and professional growth alike, SXSW® is the premier destination for discovery.
So why am I telling you this? Because I need your help to get to SXSWi (SXSW Interactive, the technology side of the festival)! In typical innovative fashion, those who would like to present at SXSWi need to first make it through a rigorous selection process that includes votes from the public. I need your vote for my proposed panel! What am I proposing, you ask? Here is the description:
Funders, researchers, and public stakeholders increasingly see the need to better communicate and curate ever-expanding bodies of research data. This panel will bring together stakeholders in the scientific data community, including a researcher, a librarian, and a federated data repository director. Before the panel commences, we will describe the current landscape of scientific data and its management, including publication, citation, archiving, and sharing of data. The panel discussion will focus on identifying gaps and unmet needs in order to help chart a path for future policy, service, and infrastructure development. Questions will include: How has the handling of scientific data changed in the last few years? What should researchers know about properly organizing, managing, and sharing their data? How can data centers, IT professionals, developers, librarians, and others help researchers with their data? Why should researchers consider sharing and/or publishing their data? How do researchers benefit from implementing data citation practices?
I’ve lined up four great speakers:
- Bill Michener, DataONE director and awesome guy (also a fellow music lover – not a coincidence)
- Jeff Dozier, Scientist and Professor at UC Santa Barbara who works on Snow Hydrology, Earth System Science, and Remote Sensing
- Andrew Sallans, Head of the Scientific Data Consulting Group at University of Virginia Library
- Me! Scientist-turned-data geek
This promises to be a great group of folks who will undoubtedly provide for an entertaining and lively discussion; hopefully afterward we can celebrate our success over a few Shiner Bocks while jamming to some tunes at the SXSW festival.
Voting from the public accounts for about 30% of the decision-making process for SXSW panel programming, so we really want to make the part we can control count. Please vote by clicking the icon below and share with others!
Curious to hear what Explosions in the Sky sounds like? Here ya go. You’re welcome.
http://www.youtube.com/watch?v=iFwOmxP56-g
DataUp at #ESA2012
I’m spending this week in Portland to attend the Ecological Society of America’s annual meeting. If you are a long-timer of the blog, you might remember I was at ESA last year to collect requirements for DataUp (then DCXL) and reported on data sharing among ecologists. Unfortunately I’m not presenting on DataUp specifically, but rather I’m here to tout the merits of DataONE. (This includes serving as one of the DataONE booth babes.) I’m anxious to showcase DataUp to the ESA crowd, but our public release isn’t until September… so I’m resisting the urge to show off the tool. With DataONE going live a few weeks ago, there is plenty to talk about with ESA attendees.
That said, there are all kinds of great things to see at ESA this week. DataONE sponsored a workshop on data management this past Sunday where there were quite a few questions about DataUp. I also participated in a session on data management planning yesterday, and will take part in a panel discussion today over lunch about the culture of data sharing in Ecology. Rest assured that DataUp will be mentioned during that discussion! Other must-sees at ESA: the session I organized with Josh Tewksbury and Steph Hampton on the future of ecology (Wednesday) and the workshop on using R to find ecological data (Thursday).
If you happen to be at ESA this year, stop by the DataONE booth and say hello. You can use the new ONE-Mercury search engine to search DataONE repositories for data, chat with me about DataUp, and eat tasty chocolates.
Despite the NSF requirement for data management plans, I get the feeling that folks still haven’t gotten on board with learning about data management and sharing. It will be fun to attend #ESA2013, showcase DataUp, and see how the culture has evolved.
Progress & Plans for DataUp Release

It was one year ago today that I moved up to the Bay Area to work on DataUp (then DCXL) in earnest. It seems fitting that this milestone be marked by some significant progress on the project. No, we haven’t released DataUp to the public yet, but we have a release date slated for this September. This is very exciting news, especially since the project got off to a bit of a slow start. We have been cooking with gas since March, however, and the DataUp tool promises to do much of what I had envisioned on my drive from Santa Barbara last year.
If you are wondering what DataUp looks like, you will need to be patient. You can, however, see some preliminary responses from our very gracious beta testers. The good news is this: most folks seem pretty happy with the tool as-is, and many offered some really great feedback that will improve the tool as we move into the community involvement phase of the development effort.
We asked 21 beta testers what they thought of DataUp features, and here are the results:
We expect that the DataUp tool will only improve from here on out, so stay tuned for our big debut in less than two months!

