Graduation Semester and Year
2020
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Vassilis Athitsos
Abstract
Accurate hand segmentation is vital in many applications in which the hands play a central role, such as sign language recognition, action recognition, and gesture recognition. A relatively unexplored obstacle to correct hand segmentation arises when the hand overlaps the face. The lack of a dataset for this research area has been one motivation for this work. Accordingly, this dissertation investigates the hand-over-face segmentation task and proposes improvements for it. Toward an in-depth study of the hand segmentation problem, the dissertation makes several contributions. First, it presents a survey of sign language recognition systems on mobile phones, which provides a recent practical example of the need for a hand segmentation dataset and comprehensive research. Second, it presents a literature review that covers and summarizes all available hand segmentation datasets. Third, I provide a public dataset (VLM-HandOverFace) for the hand segmentation task. This newly constructed dataset contains 4384 labeled frames and includes color, depth, and infrared streams recorded by Kinect. Several state-of-the-art architectures are evaluated on the VLM-HandOverFace dataset. Furthermore, this dissertation proposes the Multi-level Pyramid Scene Parsing Network (MPSPNet) for semantic segmentation. I also provide a thorough discussion and evaluation of the proposed model and of the characteristics that make it applicable to the hand-over-face segmentation challenge. Several experiments were conducted to examine MPSPNet using two object segmentation datasets and two hand segmentation datasets. The results show that the proposed method achieves at least a 6% improvement in mean intersection over union (mIoU) compared with all state-of-the-art methods. Finally, various experiments were conducted to measure the impact of including temporal motion information on MPSPNet.
Keywords
Computer vision, Machine learning, CNN, DNN, MPSPNet
Disciplines
Computer Sciences | Physical Sciences and Mathematics
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Ghanem, Sakher, "Hand-Over-Face Segmentation" (2020). Computer Science and Engineering Dissertations. 351.
https://mavmatrix.uta.edu/cse_dissertations/351
Comments
Degree granted by The University of Texas at Arlington