Civil Engineering ETDs

Author

Stephen Brown

Publication Date

7-2-2012

Abstract

In 2010, the National Science Foundation (NSF) implemented new guidelines for all scientists applying for grants. A Data Management Plan (DMP) is now required for all proposals in which data are created or gathered while working under the grant. Several organizations have produced templates and applications to assist with the construction of DMPs. The data plans provide a good overview of data processing and storage but do not provide any guidance for managing data during the research process. Large temporal hydrologic data sets can provide rich insight into complex hydrologic and ecological systems. Complications arise when attempting to query and present the data in ways that are useful for exploring and validating research hypotheses. Common tools, such as Excel or Matlab, may be helpful if you know the exact sequence of data you want to analyze. Frequently, this is not the case. Looking at long-term trends, adding and removing variables, or comparing local results to external national datasets is difficult or impossible with these tools. To overcome the limitations of current data management methods, a Consortium of Universities for the Advancement of Hydrologic Science Inc. - Hydrologic Information System (CUAHSI-HIS) server was deployed in collaboration with the Earth Data Analysis Center (EDAC) and the New Mexico Experimental Program to Stimulate Competitive Research (NMEPSCoR). Data products on the server are stored in a relational database using WaterML, an XML-based language introducing standardization to the hydrologic community and facilitating distribution and aggregation of hydrologic data. Four project types from different agencies have been selected to explore the process of obtaining and ingesting data into an HIS. Three of the projects are university-based with different stakeholders, and the fourth is a state-funded project carried out by a contractor.
Tools developed by CUAHSI for ingesting measurements into the database made processing the raw data straightforward. After the data were formatted properly, automated processes allowed millions of measurements to migrate from Excel files into the HIS. Aggregating the data and metadata without support from the principal investigator proved difficult. Deciphering the provenance of derived data proved exceptionally difficult for a data manager with little experience in specialized disciplines. Datasets that previously required hours to download, aggregate, and visualize can now be processed in minutes. Repetitive analysis tasks can be automated within the HIS, integrating local, regional, and national datasets by spatial and temporal extent and delivering the results to the research team in a variety of formats. The CUAHSI-HIS components streamline data discovery and analysis in addition to satisfying the NSF DMP requirements.

Keywords

Information storage and retrieval systems--Hydrology--Planning, Hydrology--Data Processing, Information visualization.

Sponsors

New Mexico Experimental Program to Stimulate Competitive Research (NMEPSCoR)

Document Type

Thesis

Language

English

Degree Name

Civil Engineering

Level of Degree

Masters

Department Name

Civil Engineering

First Committee Member (Chair)

Benedict, Karl

Second Committee Member

Stone, Mark
