ORCID Identifier(s)

0000-0002-9124-4873

Graduation Semester and Year

2020

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Vassilis Athitsos

Abstract

This paper addresses the problem of 3D hand pose annotation using a single depth camera. Although hand pose estimation methods rely critically on accurate 3D training data, creating such reliable training data is challenging and labor intensive. We propose a semi-automatic method for efficiently and accurately labeling the 3D hand key-points in a hand depth video. The process starts by selecting a subset of frames that is representative of all the frames in the dataset; the annotator provides only an estimate of the 2D hand key-points in these selected frames. We use this information to infer the 3D locations of the joints in all frames by enforcing appearance, temporal, and distance constraints. Finally, we demonstrate that our method generates 3D training data more accurately, with less manual intervention and more flexibility, than other state-of-the-art methods.

Keywords

Hand pose estimation, Data annotation, Computer vision, Machine learning

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
