Electrical and Computer Engineering ETDs

Publication Date

Winter 12-13-2017

Abstract

Classifying human activity in raw video remains an unsolved problem of great interest, particularly for large datasets. Although several attempts have applied image processing techniques to recognize human activity in controlled video segments, few have achieved significant success on raw video.

Raw video classification poses significant challenges that can be addressed through the use of geometric information. Current techniques rely either on temporal information in the feature space or on a combination of Convolutional and Recurrent Neural Networks (CNNs and RNNs). CNNs are used for frame feature extraction, and RNNs are then applied for motion vector extraction and classification. Because these techniques use information from the entire frame, they attempt to classify an action based on all motion vectors and all objects found in the video. Such methods are cumbersome, often difficult to train, and do not generalize well beyond the dataset used.

This thesis explores the use of color-based object detection, together with contextualization of object interactions, to isolate the motion vectors specific to a target activity within uncropped video. Feature extraction in this thesis differs significantly from other methods by using geometric relationships between objects to infer context. The approach avoids the need for video cropping or substantial preprocessing by significantly reducing the number of features analyzed in a single frame. The method was tested on 43 uncropped video clips, with 620 video frames for writing, 1050 for typing, and 1755 for talking. Using simple k-nearest neighbors (KNN) classification, the method achieved accuracies of 72.6% for writing, 71% for typing, and 84.6% for talking. Classification accuracy improved to 92.5% (writing), 82.5% (typing), and 99.7% (talking) with the use of a trained Deep Neural Network.
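
The idea of classifying an activity from geometric relationships between detected objects can be illustrated with a minimal sketch. This is not the thesis implementation: the bounding boxes, the choice of centroid offset as the geometric feature, the function names, and the toy training values are all assumptions made up for illustration. It only shows how a small per-frame feature vector could feed a simple KNN vote.

```python
import math
from collections import Counter


def centroid(box):
    """Center (x, y) of a box given as (x_min, y_min, x_max, y_max)."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def centroid_offset(box_a, box_b):
    """Offset from the centroid of box_a to the centroid of box_b.

    Hypothetical geometric feature: the abstract does not specify which
    relationships between objects are used.
    """
    ax, ay = centroid(box_a)
    bx, by = centroid(box_b)
    return (bx - ax, by - ay)


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy training set (made-up values): (feature vector, activity label).
train = [
    ((5.0, 2.0), "writing"),
    ((6.0, 1.0), "writing"),
    ((4.0, 3.0), "writing"),
    ((40.0, 30.0), "typing"),
    ((42.0, 28.0), "typing"),
    ((38.0, 33.0), "typing"),
]

# Two hypothetical detections in one frame, e.g. from color-based detection.
query = centroid_offset((0, 0, 10, 10), (5, 2, 15, 12))
label = knn_predict(train, query)
```

Because each frame is reduced to a couple of numbers rather than a full-frame feature map, the classifier itself can stay simple, which is the trade-off the abstract describes.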

Keywords

human activity classification; context-based methods

Sponsors

This material is based upon work supported by the National Science Foundation under Grant No. 1613637 and NSF AWD CNS-1422031

Document Type

Thesis

Language

English

Degree Name

Computer Engineering

Level of Degree

Masters

Department Name

Electrical and Computer Engineering

First Committee Member (Chair)

Marios Pattichis

Second Committee Member

Manel Martinez-Ramon

Third Committee Member

Ramiro Jordan
