Research on video activity detection has mainly focused on identifying well-defined human activities in short video segments, often requiring large-parameter systems and extensive training datasets. This dissertation introduces a low-parameter, modular system with fast inference that can be trained on limited datasets without transfer learning from large-parameter systems. The system accurately detects specific activities and associates them with students in real-life classroom videos. Additionally, an interactive web-based application is developed to visualize human activity maps over long classroom videos.
Long-term video activity detection in classrooms presents several challenges: multiple simultaneous activities, rapid transitions, long-term occlusions, durations exceeding 15 minutes, numerous individuals performing similar activities, and the need to differentiate subtle hand movements. The system combines fast activity initialization, object detection methods, and a low-parameter dyadic 3D-CNN classifier, processing 1-hour videos in 15 minutes for typing activities and in 50 minutes for writing activities.
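To put the processing times quoted above in perspective, they can be converted into a real-time factor (minutes of video handled per minute of compute). The following is a minimal sketch of that arithmetic only; the helper name `real_time_factor` is introduced here for illustration and does not come from the dissertation.

```python
def real_time_factor(video_minutes: float, processing_minutes: float) -> float:
    """Return minutes of video processed per minute of compute."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return video_minutes / processing_minutes


# Figures taken from the abstract: a 1-hour video takes 15 minutes
# for typing detection and 50 minutes for writing detection.
typing_rtf = real_time_factor(60, 15)
writing_rtf = real_time_factor(60, 50)

print(f"typing:  {typing_rtf:.1f}x real time")   # typing:  4.0x real time
print(f"writing: {writing_rtf:.1f}x real time")  # writing: 1.2x real time
```

Under these figures, both activities are processed faster than real time, with typing detection roughly four times faster than playback.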
Optimizing the inference pipeline involves determining the best low-parameter 3D-CNN architectures, transcoding smaller video regions at an optimized frame rate, and using an optimal batch size for processing input videos. The resulting low-parameter model uses 18.7K parameters, requires 136.32 MB of memory, and runs at 4,620 frames per second, outperforming current methods in parameter count, GPU memory usage, inference speed, and classification accuracy.
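As a rough illustration of how a 3D-CNN can stay in the tens of thousands of parameters, the sketch below counts the learnable parameters of a stack of 3D convolution layers using the standard formula (input channels × output channels × kernel volume, plus one bias per output channel). The channel widths and the 3×3×3 kernel are hypothetical values chosen for this example; they are not the architecture described in the dissertation.

```python
def conv3d_params(in_channels: int, out_channels: int,
                  kernel=(3, 3, 3), bias: bool = True) -> int:
    """Count learnable parameters in one 3D convolution layer."""
    kt, kh, kw = kernel
    weights = in_channels * out_channels * kt * kh * kw
    return weights + (out_channels if bias else 0)


# Hypothetical (in_channels, out_channels) pairs for four stacked layers.
# These widths are an assumption for illustration only.
layers = [(3, 4), (4, 8), (8, 16), (16, 32)]

per_layer = [conv3d_params(i, o) for i, o in layers]
total = sum(per_layer)

for (i, o), n in zip(layers, per_layer):
    print(f"conv3d {i:>2} -> {o:>2}: {n:>6} parameters")
print(f"total: {total} parameters")
```

Because the count grows with the product of input and output channel widths, keeping the widths small is what holds such a model to a parameter budget on the order of tens of thousands.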
Deep learning, Neural Networks, Video analysis, Education videos, 3D-CNNs, Activity detection, Visualization
National Science Foundation under Grant Nos. 1613637, 1842220, and 1949230.
Level of Degree
Electrical and Computer Engineering
First Committee Member (Chair)
Prof. Marios S. Pattichis
Second Committee Member
Prof. Sylvia Celedón-Pattichis
Third Committee Member
Dr. Andreas S. Panayides
Fourth Committee Member
Prof. Manel Martinez-Ramon
Fifth Committee Member
Prof. Ramiro Jordan
Jatla, Venkatesh. "Long-term Human Video Activity Quantification in Collaborative Learning Environments." (2023). https://digitalrepository.unm.edu/ece_etds/592