Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Computer Science


Computer Science and Engineering

First Advisor

Vassilis Athitsos


Analyzing human motion is vital for many tasks, including human-computer interaction, sign language recognition, and the assessment of cognitive disorders. Providing automatic assessments for cognitive disorders increases the accessibility and affordability of life-changing tests and treatments. For sign language recognition, automated translation systems bridge the gap between native and non-native signers. Additionally, dictionary look-up systems are helpful for native signers learning a new language. Common to both of these tasks is a reliance on fine motor function in the hands. Hand pose estimation methods are used to drive applications that rely on hand shape. These tasks present unique and difficult challenges, which are investigated in this dissertation.
We present our preliminary data analysis towards an automated assessment system for the Activate Test of Embodied Cognition (ATEC), a measurement of cognitive skills through physical activity. Evaluating cognitive function through physical movement requires data from many participants performing a wide variety of physical tasks. Collecting such a dataset is a time-consuming yet worthwhile goal. Automatically scoring the movements in each task requires a method that is both robust to noise and accurate, so that proper recommendations are made to experts. We evaluate three ATEC tasks designed to address attention, working memory, response inhibition, rhythm, and coordination in children: Sailor Step, Ball-Drop-to-the-Beat, and Finger Tap. These tasks are specifically designed to assess lower and upper body accuracy, response inhibition, rhythm, and gross motor function. We present our data collection framework and evaluate baseline methods on real ATEC data.
In sign languages, a periodic sign is one that contains repeated movements. Dynamic Time Warping (DTW) is often used in sign language recognition to generate a frame alignment between two input signs that provides a measure of their similarity. Alignments provided by DTW may be erroneous when the input contains periodic signs, especially when the number of periods differs between inputs. Additionally, the number of periods may vary between individual signers and signs. Little work has been done to address the problem of recognizing periodic signs in the context of DTW. This work evaluates two DTW-based approaches. The first uses a newly defined periodic warping path. The second uses manual annotations to truncate periodic input to contain no more than two periods. These two methods are compared against a standard implementation of DTW. Recognition accuracy and quality of alignment are analyzed. The results motivate a need for further research in periodic sign language recognition.
Deep-learning-based 3D hand pose estimation requires large amounts of training data. Fully supervised methods provide reasonable accuracy but require 3D annotations for each individual frame in the dataset. Providing such annotations is an expensive task that may be infeasible for many novel applications. This dissertation investigates self-supervised methods for training 3D hand pose estimation models with few or no joint annotations. The self-supervised component is based on a 3D hand model and generates a sample of the predicted pose. Parameter choices are evaluated to determine the best representation for hand pose and shape. The predicted hand shape is compared with the input depth image as a means of supervision, which can be used in parallel with joint annotations. We consider two cases: one in which limited annotations exist, and one in which the model is trained with only unlabelled samples.
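As a rough illustration of the standard DTW alignment mentioned in the abstract, the following is a minimal sketch on 1-D toy sequences. The function name `dtw`, the absolute-difference frame cost, and the toy "periodic" signals are assumptions made for illustration only; they are not the dissertation's features, its periodic warping path, or its implementation.

```python
import math


def dtw(a, b):
    """Classic DTW between two 1-D sequences.

    Returns (total_cost, path), where path is a list of (i, j) frame pairs.
    The frame cost |a[i] - b[j]| is an illustrative assumption.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # advance both sequences
                cost[i - 1][j],      # advance a only
                cost[i][j - 1],      # advance b only
            )

    # Backtrack from the end to recover one optimal frame alignment.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Hypothetical toy "signs": the same movement repeated two and three times.
one_period = [0, 1, 0, -1]
two_periods = one_period * 2
three_periods = one_period * 3

mismatch_cost, _ = dtw(two_periods, three_periods)
# Truncating the longer input to two periods (here done by hand, standing in
# for a manual annotation) removes the period-count mismatch in this toy case.
truncated_cost, _ = dtw(two_periods, three_periods[: len(two_periods)])
```

In this toy setting the two inputs differ only in the number of repetitions, yet the unmodified alignment has a nonzero cost, while the truncated pair aligns perfectly. Real sign data is multi-dimensional and noisy, so this only hints at why the period count matters.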


Computer vision, Machine learning, Cognitive evaluation, Sign language


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington