Understanding, Characterizing, and Modeling Complex Hydrologic Systems

Cyber-Infrastructure for Long-Tail Data

Introduction:
Individuals and small groups often collect scientific data that target specific scientific questions and cover a limited geographic or temporal range. Taken together, however, a large number of such collections constitute a database of immense value to the scientific community. Such data are complex in that they form a heterogeneous collection spanning many dimensions, coordinate systems, scales, variables, providers, users, and scientific contexts. These data have been defined as long-tail data.



The NSF-funded Datanet project SEAD (Sustainable Environment Actionable Data) is developing lightweight data services designed to meet the needs of next-generation sustainability projects. Our primary aim is to enable sophisticated management of heterogeneous data while dramatically lowering the cost and effort required to curate and preserve data for long-term community use. The Critical Zone Observatory community is one of the primary communities supported by SEAD.

The NSF-funded DIBBS: Brown Dog project is developing services that will allow past and present unstructured data to be used in scientific investigations. “Unstructured” data are viewed broadly as comprising a heterogeneous collection of data whose formats reflect temporal and disciplinary legacies; data from emerging low-cost, open-hardware-based sensors and embedded sensor networks that lack well-defined metadata and sensor characteristics; and data that are available as maps, images, and text. Investigation of Critical Zone processes is one of the scientific focus areas of the project.

More information:
1. The Sustainable Environment - Actionable Data Project
2. The Brown Dog Project
3. The Geo-Semantics Project