Sachit Kaul

Graduation Semester and Year




Document Type


Degree Name

Master of Science in Mechanical Engineering


Mechanical and Aerospace Engineering

First Advisor

Kamesh Subbarao


Object detection and recognition using computer vision has been an interesting and challenging field of study for the past three decades. Recent advances in deep learning, together with increases in computational power, have reignited researchers' interest in this field over the last decade. Applying machine learning and computer vision techniques to scene classification and object localization, particularly for automated driving, has been a topic of discussion for the last half decade, and there have been notable advances recently as self-driving cars become a reality. In this thesis we focus on Region-based Convolutional Neural Networks (R-CNN) for object recognition and localization to enable Automated Driving Assistance Systems (ADAS). R-CNN combines two ideas: (1) one can apply high-capacity Convolutional Neural Networks (CNNs) to bottom-up region proposals in order to localize and segment objects, and (2) when labelled training data is scarce, supervised pre-training on an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Inspired by the R-CNN framework, we describe an object detection and segmentation system that uses a multilayer convolutional network to compute highly discriminative yet invariant features, classifies image regions, and outputs those regions as detected bounding boxes. The system targets a driving scenario and detects objects commonly found on the road, such as traffic signs, cars, and pedestrians. We also discuss different types of region-based convolutional networks such as R-CNN, Fast R-CNN, and Faster R-CNN, describe their architectures, and perform a time study to determine which of them achieves real-time object detection for a driving scenario when implemented on a regular PC architecture.
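The propose-then-classify flow described above can be illustrated with a minimal sketch. This is not the system from the thesis: the sliding-window `propose_regions` is a hypothetical stand-in for bottom-up region proposals, and `mean_intensity` is a toy scorer standing in for the convolutional network. All function names and parameters here are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[float]]        # grayscale image as rows of intensities


def propose_regions(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Hypothetical stand-in for bottom-up proposals: a fixed sliding window."""
    boxes = []
    for y in range(0, height - size + 1, stride):
        for x in range(0, width - size + 1, stride):
            boxes.append((x, y, x + size, y + size))
    return boxes


def mean_intensity(image: Image, box: Box) -> float:
    """Toy scorer standing in for a CNN classifier: mean brightness of the crop."""
    x0, y0, x1, y1 = box
    pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(pixels) / len(pixels)


def detect(
    image: Image,
    score_region: Callable[[Image, Box], float],
    size: int,
    stride: int,
    threshold: float,
) -> List[Tuple[Box, float]]:
    """Score every proposed region and keep those at or above the threshold."""
    height, width = len(image), len(image[0])
    detections = []
    for box in propose_regions(width, height, size, stride):
        score = score_region(image, box)
        if score >= threshold:
            detections.append((box, score))
    # Highest-scoring boxes first, as a detector would normally report them.
    return sorted(detections, key=lambda d: d[1], reverse=True)
```

In a real detector the scorer would be a trained network and overlapping boxes would typically be merged afterwards; the sketch only shows how region proposals and per-region classification fit together.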
Further, we discuss how such R-CNNs can be used to determine the distance of on-road objects such as cars, traffic signs, and pedestrians from a sensor (camera) mounted on the vehicle, which shows how computer vision and machine learning techniques are useful in automated braking systems (ABS) and in perception algorithms such as Simultaneous Localization and Mapping (SLAM).
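The abstract does not state how distance is computed from a detection. One common approach, assumed here purely for illustration, is the pinhole camera relation: an object of known real-world height that appears with a given bounding-box height in pixels lies at a distance of roughly focal length times real height divided by pixel height. The function name, parameters, and example values below are hypothetical.

```python
def estimate_distance_m(
    focal_length_px: float, real_height_m: float, box_height_px: float
) -> float:
    """Pinhole-model distance estimate (an assumption, not the thesis method).

    focal_length_px: camera focal length expressed in pixels.
    real_height_m:   assumed real-world height of the detected object in metres.
    box_height_px:   height of the detected bounding box in pixels.
    """
    if box_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return focal_length_px * real_height_m / box_height_px


# Example: a 1.5 m tall object spanning 100 px with a 1000 px focal length.
distance = estimate_distance_m(1000.0, 1.5, 100.0)
```

This estimate depends on a calibrated focal length and a reasonable guess of the object's true height, so it is best read as a rough range cue rather than a precise measurement.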


ADAS, Computer vision, Convolutional neural networks


Aerospace Engineering | Engineering | Mechanical Engineering


Degree granted by The University of Texas at Arlington