This thesis addresses the need to reconstruct a 3D world model from raw video frames using 2D projective geometry. We propose a computer-aided approach to reconstructing the 3D speaker geometry from classroom videos of students learning Python.
The proposed method uses a transformer model to detect line candidates. Once users identify lines corresponding to three orthogonal directions, the method computes the three vanishing points and the camera matrix. It then locates the students' mouths using face landmark detection. After users verify the estimated projections of the students' mouths onto the table, the approach reconstructs the 3D speaker geometry for students whose mouths are visible.
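The vanishing-point step can be illustrated with the standard homogeneous-coordinate construction. The sketch below is not taken from the thesis: the function names are hypothetical, and the focal-length formula assumes square pixels, zero skew, and a known principal point. It intersects two image line segments that belong to the same 3D direction to obtain a vanishing point, then recovers the focal length from the vanishing points of two orthogonal directions.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b):
    """Intersect two image segments that share a 3D direction.

    Each segment is a pair of (x, y) endpoints. Returns (x, y).
    """
    v = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(v[2]) < 1e-12:
        # The segments are parallel in the image, so the vanishing
        # point lies at infinity and has no finite pixel position.
        raise ValueError("vanishing point at infinity")
    return (v[0] / v[2], v[1] / v[2])


def focal_length(v1, v2, principal_point):
    """Focal length in pixels from two orthogonal vanishing points.

    Assumption (not from the thesis): square pixels, zero skew, and
    a known principal point, so that
    (v1 - c) . (v2 - c) + f^2 = 0.
    """
    cx, cy = principal_point
    f_squared = -((v1[0] - cx) * (v2[0] - cx) + (v1[1] - cy) * (v2[1] - cy))
    if f_squared <= 0:
        raise ValueError("vanishing points inconsistent with orthogonality")
    return math.sqrt(f_squared)


# Synthetic example: two segments converging toward the same point.
vp = vanishing_point(((0, 0), (560, 120)), ((0, 480), (560, 360)))
f = focal_length(vp, (-480.0, 240.0), (320.0, 240.0))
```

With a third orthogonal vanishing point, the same orthogonality constraints can also be used to estimate the principal point rather than assuming it, and the three directions give the columns of the camera rotation up to sign.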
The method is tested on synthetic images and real-life classroom images. The results are promising for 3D table reconstruction. Furthermore, the 3D speaker geometry can be reconstructed without any projection corrections in cases where the speakers are visible on the left and right sides of the table.
Electrical and Computer Engineering
Janampa Rojas, Sebastian Alonso. "3D Speaker Geometry Inference from Digital Video using 2D Projective Geometry." (2023). https://digitalrepository.unm.edu/ece_etds/581