Author

De Wang

Graduation Semester and Year

2018

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Heng Huang

Abstract

To unleash the power of big data, efficient algorithms that scale to millions of examples are needed. Deep learning is one area that benefits enormously from big data. Deep learning uses neural networks to mimic the human brain, an approach termed connectionist in the AI community. In this dissertation, we propose several novel learning strategies to improve the performance of connectionist models. Evaluating a large neural network at inference time requires substantial GPU memory and computation, which degrades the user experience through response latency. Model distillation transfers the knowledge contained in a cumbersome model to a smaller one, imitating the way human learning is guided by teachers. We propose darker knowledge, a new method of knowledge distillation via rich-target regression. The proposed method outperforms the current state-of-the-art model distillation method proposed by Hinton et al. Many high-level machine learning tasks depend on model distillation, such as knowledge transfer between different neural network architectures, black-box attack and defense in computer security, and policy distillation in reinforcement learning; these tasks would benefit greatly from the improved distillation method. In another work, we design a new deep neural network architecture that enables model ensembling within a single network. The network is composed of many columns, where each column is a small computational graph that performs a series of non-linear transformations. We train these multi-column branching neural networks by stochastically dropping columns, which prevents co-adaptation of columns from causing overfitting and encourages each column to learn different features that enhance the aggregated representation. The new architecture exhibits ensemble properties within a single model and improves the classification performance of a single neural network over the current state-of-the-art architecture.
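For background, the Hinton-style soft-target distillation that the abstract compares against can be sketched in a few lines of NumPy: the teacher's logits are softened with a temperature before the student is trained to match them. This is a minimal illustration, not the dissertation's darker-knowledge method; the function names and the temperature value are illustrative choices.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T produces softer probabilities.
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between the softened teacher and student distributions,
    # scaled by T^2 (as in Hinton-style distillation) so gradient magnitudes
    # stay comparable across temperatures.
    p = softmax(teacher_logits, T)   # soft targets from the teacher
    q = softmax(student_logits, T)   # student's softened prediction
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T
```

The soft targets carry the teacher's inter-class similarity structure (the "dark knowledge"), which a one-hot label discards; the rich-target regression proposed here builds on that observation.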
We also studied the vulnerability of modern deep learning systems at both the training stage and the evaluation stage. At the training stage, attackers may contaminate the training data with noise, which deteriorates the recognition performance of deep learning models. We propose a new loss function that is more robust to noisy input and outperforms the standard practice of neural network training. At the evaluation stage, we show that even though neural networks achieve unprecedented recognition accuracy on image recognition tasks, the models are vulnerable to access attacks in which attackers easily generate fake identity proofs by exploiting deployed neural networks. We show that what neural networks learn is very different from the human visual system. Given a trained model, we can easily generate an image that is classified into a target class with almost 100% confidence, even though the image may look like white noise to human eyes.
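The fooling-input phenomenon described above can be illustrated on a toy linear softmax classifier: starting from random noise, gradient ascent on the target class's log-probability drives the model's confidence toward 100%, even though the resulting input is meaningless to a human. This is a minimal sketch of the general idea, not the dissertation's procedure; the function and its parameters are hypothetical, and a real attack would run the same loop against a deployed deep network's gradients.

```python
import numpy as np

def fool_linear_classifier(W, b, target, steps=500, lr=0.2):
    # Gradient ascent on log p(target | x) for a linear softmax model,
    # starting from Gaussian noise -- a toy version of synthesizing a
    # "fooling image" from a trained classifier.
    rng = np.random.default_rng(0)
    x = rng.normal(size=W.shape[1])
    for _ in range(steps):
        logits = W @ x + b
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # d log p[target] / dx = W[target] - sum_k p[k] * W[k]
        grad = W[target] - p @ W
        x += lr * grad
    return x, p[target]
```

Because the input is optimized directly rather than drawn from natural images, the classifier's confidence can be pushed arbitrarily high while the input itself remains noise-like, which is exactly the gap between learned decision boundaries and human vision that the dissertation examines.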

Keywords

Deep learning, Neural networks, Knowledge distillation, Neural network architecture, Security

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
