Author

Ashiq Imran

ORCID Identifier(s)

0000-0001-5095-7757

Graduation Semester and Year

2021

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Vassilis Athitsos

Second Advisor

Farhad Kamangar

Third Advisor

Christopher Conly

Fourth Advisor

David Levine

Abstract

Deep neural networks have made a significant impact on many computer vision applications, driven by large-scale labeled datasets. In many applications, however, gathering large-scale labeled data is expensive and time-consuming, and with limited labeled data it is difficult to achieve strong performance. Transfer learning, a machine learning paradigm in which a model pre-trained on one task is reused for another, is widely applied in real-world problems to cope with limited labeled training data. This dissertation investigates transfer learning and related machine learning techniques, such as domain adaptation, for visual categorization applications. First, we leverage transfer learning for fine-grained visual categorization (FGVC). FGVC is a challenging problem in computer vision that differs from general recognition: it is characterized by large intra-class differences and subtle inter-class differences, so a model must recognize and localize the nuances that distinguish subordinate categories. We tackle this problem in a weakly supervised manner, feeding neural network models additional data generated by a data augmentation technique driven by a visual attention mechanism. We perform domain-adaptive knowledge transfer by fine-tuning our base network model, and we evaluate our approach on six challenging and commonly used FGVC datasets. Using attention-aware data augmentation with features derived from InceptionV3 pre-trained on large-scale datasets, our method outperforms competing methods on several FGVC datasets and achieves competitive results on the others. These experiments show that transfer learning from large-scale datasets, combined with visual attention-based data augmentation, yields state-of-the-art results on several FGVC datasets.
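The attention-guided augmentation idea above can be illustrated with a minimal sketch: crop the image to the region where a normalized attention map is strong, then feed the crop back as additional training data. This is an illustration only, not the dissertation's exact method; the function name `attention_crop` and the thresholding scheme are assumptions.

```python
import numpy as np

def attention_crop(image, attention, threshold=0.5):
    """Crop `image` to the bounding box where the min-max normalized
    attention map meets or exceeds `threshold`.

    `image` and `attention` are 2-D arrays of the same spatial size.
    """
    att = (attention - attention.min()) / (np.ptp(attention) + 1e-8)
    ys, xs = np.where(att >= threshold)
    # Bounding box of the high-attention region (inclusive-exclusive).
    y0, y1 = ys.min(), ys.max() + 1
    x0, x1 = xs.min(), xs.max() + 1
    return image[y0:y1, x0:x1]

# Toy example: attention concentrated on a 4x4 region of an 8x8 image.
img = np.arange(64, dtype=float).reshape(8, 8)
att = np.zeros((8, 8))
att[2:6, 3:7] = 1.0
crop = attention_crop(img, att)
```

In the dissertation's setting, the attention map would come from intermediate feature activations of the fine-tuned network rather than being given explicitly, and the resulting crops augment the training set.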
Transfer learning typically assumes that the source and target domains share the same distribution, but this rarely holds in real-world applications, and direct transfer across domains often performs poorly because of domain shift. Domain adaptation, a sub-field of transfer learning, has become a prominent problem setting that refers to learning a model from a source domain that performs reasonably well on a target domain. This dissertation investigates and proposes improvements to visual categorization using domain adaptation. Following this context, a literature review covering and summarizing recently proposed domain adaptation methods is presented. Finally, we propose a technique that combines an adaptive feature norm with subdomain adaptation to boost transfer gains: subdomain adaptation enhances the ability of deep adaptation networks by capturing fine-grained features from each category, while the adaptive feature norm further increases transfer gains. Our method achieves state-of-the-art results on standard cross-domain adaptation datasets for the object categorization task.
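The adaptive feature norm component can be sketched with one common formulation: penalize the deviation of each feature vector's L2 norm from a shared target radius, which encourages larger, more transferable feature norms on both domains. This is a minimal numpy illustration of that idea; the function name `feature_norm_loss` and the fixed-radius variant are assumptions, not the dissertation's exact formulation.

```python
import numpy as np

def feature_norm_loss(features, radius):
    """Mean squared deviation of per-sample L2 feature norms from `radius`.

    `features` is an (n_samples, feature_dim) array; driving this loss
    down pushes every feature vector's norm toward the target radius.
    """
    norms = np.linalg.norm(features, axis=1)
    return np.mean((norms - radius) ** 2)

# Two toy feature vectors with norms 5 and 10, target radius 10:
feats = np.array([[3.0, 4.0], [6.0, 8.0]])
loss = feature_norm_loss(feats, radius=10.0)
# ((5 - 10)^2 + (10 - 10)^2) / 2 = 12.5
```

In training, this term would be added to the classification loss and a subdomain alignment loss, computed on features from both the source and target batches.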

Keywords

Domain adaptation, Transfer learning, Deep learning, Visual classification

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
