Computer Science ETDs

Publication Date

Spring 5-12-2018

Abstract

The traditional methods for analyzing information in digital documents have evolved with the ever-increasing volume of data. Challenges in analyzing scientific publications include the lack of a unified vocabulary and a defined context, different standards and formats for presenting information, various types of data, and diverse areas of knowledge. These challenges hinder the rapid detection, understanding, comparison, sharing, and querying of information.

I design a dynamic conceptual data model built from elements common to publications in any domain, such as context, metadata, and tables. To enrich the resulting models, I use related definitions drawn from ontologies and the Internet. In this way, the dissertation generates semantically enriched data models from digital publications based on Semantic Web principles, which allow people and computers to work cooperatively. Finally, this work uses a vocabulary and ontologies to generate a structured characterization and to organize the data models. This organization supports the integration, sharing, management, and comparison of information from publications.

Language

English

Keywords

Table Understanding, Information Modeling, Data Integration, Semantic Interoperability, Information Extraction, Data Science

Document Type

Dissertation

Degree Name

Computer Science

Level of Degree

Doctoral

Department Name

Department of Computer Science

First Committee Member (Chair)

Trilce Estrada-Piedra

Second Committee Member

Soraya Abad-Mota

Third Committee Member

Abdullah Mueen

Fourth Committee Member

Sarah Stith