ORCID Identifier(s)

0000-0001-9081-9252

Graduation Semester and Year

2022

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Vassilis Athitsos

Abstract

"In this thesis we focus on two instances of human behavior modeling in long untrimmed videos: drowsiness detection and action segmentation. In the first section, we focus on drowsiness detection. Specifically, we introduce a large, public, real-life dataset and a baseline temporal model to classify drowsiness into three stages: alert, low vigilant, and drowsy. In the second section, we study action segmentation in instructional videos under weak supervision. To save time and cost, weakly supervised methods are trained using only video-level action sequences, whereas fully supervised methods are trained using frame-level labels. We study weakly supervised action segmentation from multiple aspects. First, we present a duration model that predicts the remaining duration of an ongoing action in order to iteratively align a given sequence of actions in an input video. Second, we propose a hierarchical approach to segmentation, where top-level tasks are predicted to constrain lower-level atomic actions. Third, we introduce the first weakly supervised online action segmentation model, which segments streaming videos online at test time using dynamic programming, and show its advantages over a greedy sliding-window approach. Finally, we present a multi-view training strategy that exploits frame-wise correspondence between multiple views as supervision for training on weakly labeled instructional videos. The experimental results on multiple public datasets show the efficacy of our algorithms."

Keywords

Action segmentation, Drowsiness detection, Weak supervision, Video understanding, Instructional videos

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
