Graduation Semester and Year

2010

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Vassilis Athitsos

Abstract

American Sign Language (ASL) is the dominant sign language of deaf people in the United States and parts of Canada. Between 500,000 and 2 million people in the United States use ASL as their primary language. ASL uses the hands, face and body, with constantly changing movements and orientations. Because the language is based on gestures rather than a printed alphabet, it is difficult to find the meaning of a sign given only a video of it. Multimedia tools and dictionaries exist for viewing the sign video for a given word, but no dictionary returns the corresponding word when given a sign video. This gap motivated the development of video-based lookup in an ASL dictionary. The vision is a system in which a user can look up the meaning of an ASL sign simply by performing the gesture in front of a video camera connected to a computer. The computer compares the unknown sign with a database of signs to identify the most likely matches.

In the existing ASL lexicon project, a user submits a query sign video and the application finds the most similar signs in the system database. The existing system evaluates the similarity between the query video and every sign video in the database using the Dynamic Time Warping (DTW) distance. DTW measures the similarity between two sequences (e.g. time series) that may vary in time or speed by finding an optimal match between them under certain restrictions. The sequences are warped non-linearly in the time dimension, so the resulting similarity measure is independent of certain non-linear variations in time. The existing ASL lexicon project uses the similarity in hand locations and orientations to look up a gesture in a dictionary of ASL signs. The ability of DTW to handle temporal misalignments helps recognize signs that differ in timing or speed.

Periodic ASL signs are signs in which an action is repeated. The number of repetitions is at the signer's discretion. In such cases DTW still attempts to align the input video with the dictionary sign video, but the alignment is not meaningful if the number of repetitions in the input video differs from that in the dictionary sign video. Because DTW then produces a non-meaningful association, the resulting similarity scores are poor. This paper attempts to correct this problem of 'Incorrect Period Matching'. It defines a protocol for annotating periodic signs and introduces a method for improving system accuracy on such signs. It builds an annotated database of periodic sign videos from an ASL lexicon dataset of 1113 unique signs, capturing the temporal information (start and end of each period) for all the periods executed in each periodic sign video. The paper also provides a mechanism for generating periods synthetically: we use the temporal information of the last period to create subsequent periods for the training video. By synthetically generating periods, we successfully corrected the problem that arose when using DTW on periodic signs.
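
As a rough illustration of the period-mismatch problem described in the abstract, the sketch below computes a basic DTW cost between two sequences of 2D hand positions and then extends a dictionary sign by repeating its last annotated period. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names `dtw_distance` and `extend_periods`, the toy trajectories, and the use of hand position alone (without orientation) are all hypothetical choices made for this example.

```python
import math

def dtw_distance(a, b):
    """Basic DTW cost between two sequences of 2D points (toy stand-in for hand locations)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def extend_periods(frames, last_start, last_end, extra):
    """Insert `extra` copies of the last annotated period right after it.

    `last_start` and `last_end` are inclusive frame indices of the last period
    (assumed to come from manual annotation).
    """
    period = frames[last_start:last_end + 1]
    return frames[:last_end + 1] + period * extra + frames[last_end + 1:]

# Hypothetical one-period motion: the hand moves out and back.
one_period = [(0.0, 0.0), (1.0, 0.0)]
dictionary_sign = one_period * 2   # dictionary video repeats the action twice
query_sign = one_period * 3        # signer repeats the action three times

mismatch_cost = dtw_distance(dictionary_sign, query_sign)
extended_sign = extend_periods(dictionary_sign, last_start=2, last_end=3, extra=1)
matched_cost = dtw_distance(extended_sign, query_sign)
print(mismatch_cost, matched_cost)
```

In this toy case the two-period dictionary sign cannot be aligned with the three-period query at zero cost, while the synthetically extended version can. Real sign videos would involve noisy features and more frames per period, so the improvement would show up as a lower rather than a zero cost.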

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
