Electrical and Computer Engineering ETDs

Publication Date

Spring 4-14-2023

Abstract

This thesis addresses the need to reconstruct a 3D world model from raw video frames using 2D projective geometry. We propose a computer-aided approach to reconstructing the 3D speaker geometry from classroom videos of students learning Python.

The proposed method uses a transformer model to detect line candidates. Once the users identify lines corresponding to three orthogonal directions, the method computes the three vanishing points and the camera matrix. The method then locates the students’ mouths using face landmark detection. After the users verify the estimated projections of the students’ mouths onto the table, the method reconstructs the 3D speaker geometry for students whose mouths are visible.

The method is tested on synthetic images and real-life classroom images. The results are promising for 3D table reconstruction. Furthermore, the 3D speaker geometry can be reconstructed without any projection corrections when the speakers are visible on the left and right sides of the table.
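
As an illustration of the vanishing-point step mentioned in the abstract, the following is a minimal sketch of one standard way to recover camera intrinsics from three orthogonal vanishing points. It is not taken from the thesis: it assumes a pinhole camera with square pixels and zero skew, and all function names are hypothetical. Under those assumptions the principal point is the orthocenter of the triangle formed by the three vanishing points, and the focal length f satisfies f^2 = -(v1 - p)·(v2 - p) for any two of them.

```python
# Sketch only: assumes square pixels and zero skew; names are illustrative.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line through two image points given as (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def vanishing_point(seg_a, seg_b):
    """Intersect two image segments whose 3D lines are parallel."""
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < 1e-12:
        raise ValueError("segments are parallel in the image")
    return (x / w, y / w)

def principal_point(v1, v2, v3):
    """Orthocenter of the triangle formed by three vanishing points."""
    # Altitude from v1: (p - v1) . (v2 - v3) = 0
    a1, b1 = v2[0] - v3[0], v2[1] - v3[1]
    c1 = a1 * v1[0] + b1 * v1[1]
    # Altitude from v2: (p - v2) . (v1 - v3) = 0
    a2, b2 = v1[0] - v3[0], v1[1] - v3[1]
    c2 = a2 * v2[0] + b2 * v2[1]
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("vanishing points are collinear")
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def focal_length(v1, v2, p):
    """Focal length in pixels from two orthogonal vanishing points."""
    f2 = -((v1[0] - p[0]) * (v2[0] - p[0]) + (v1[1] - p[1]) * (v2[1] - p[1]))
    if f2 <= 0:
        raise ValueError("vanishing points inconsistent with orthogonality")
    return f2 ** 0.5

def intrinsics(v1, v2, v3):
    """Assemble the 3x3 intrinsic matrix K as nested lists."""
    p = principal_point(v1, v2, v3)
    f = focal_length(v1, v2, p)
    return [[f, 0.0, p[0]], [0.0, f, p[1]], [0.0, 0.0, 1.0]]
```

In practice each vanishing point would be estimated from many user-selected segments (for example by least squares) rather than from a single pair, and the full camera matrix additionally needs the rotation and translation, which this sketch does not cover.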

Document Type

Thesis

Language

English

Degree Name

Computer Engineering

Level of Degree

Masters

Department Name

Electrical and Computer Engineering

First Committee Member (Chair)

Marios Pattichis

Second Committee Member

Víctor Murray

Third Committee Member

Manel Martinez-Ramon
