Graduation Semester and Year
2021
Language
English
Document Type
Thesis
Degree Name
Master of Science in Aerospace Engineering
Department
Mechanical and Aerospace Engineering
First Advisor
Kamesh Subbarao
Abstract
Real-time object classification, localization, and detection with a region-based convolutional neural network (R-CNN) either requires high computational power or consumes a tremendous amount of time on the available onboard computer system, so neither approach is practical for real-time object detection. In contrast, the YOLO-v3 network (YOLO stands for "you only look once") is practical for live object classification, localization, and detection, since the YOLO algorithm runs faster than the sliding-window approach used by R-CNN. Sensor fusion in this context requires estimating and associating information from more than one optical sensor, which improves the accuracy of locating the object in the real world. The goal of this research is to implement a real-time object detection and tracking algorithm for a collision avoidance application on an autonomous ground vehicle. Transfer learning was performed on the pre-trained YOLO-v3 artificial neural network using a supervised learning algorithm, and a labeled dataset was created to train the deep neural network. Hyper-parameters such as mini-batch size, L2 regularization parameter, number of epochs, and learning rate were tuned to study the trade-off between the bias and variance of the classifier and to improve the performance of the network. Moreover, intrinsic calibration of the stereo camera and LiDAR was performed to find the standard measurement error by re-projecting the measurement data onto ground-truth data. Similarly, extrinsic calibration was performed to derive the standard transformation error between these two optical sensors. Finally, the object's states were fed to a centralized sensor fusion framework, which uses a linear Kalman filter, to estimate the states of the objects. The complete framework was developed in MATLAB 2020b using the deep learning, computer vision and image processing, LiDAR, and sensor fusion and tracking toolboxes. The experimental platform was equipped with two single pin-hole cameras with Sony IMX 179 optical sensors, an Intel RealSense L515 LiDAR camera, and an Intel NUC as the on-board processing unit. Four 3D shapes (cylinder, cone, pyramid, and rectangular parallelepiped) were used to train the neural network for object detection. The Driving Scenario Designer tool from MATLAB 2020b was used to simulate a driving scene and further used to test the sensor fusion framework.
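The centralized fusion step summarized above can be illustrated with a minimal linear Kalman filter. The sketch below is a simplified illustration written in Python/NumPy rather than the MATLAB Sensor Fusion and Tracking Toolbox used in the thesis; it assumes a constant-velocity motion model, 2-D position-only measurements from the stereo camera and the LiDAR in a common vehicle frame, and illustrative noise covariances. All names and numerical values are hypothetical, not the author's implementation.

```python
import numpy as np

# Minimal linear Kalman filter sketch for a constant-velocity target model.
# State x = [px, py, vx, vy]; both the stereo camera and the LiDAR are assumed
# to report a 2-D position measurement z = [px, py] in a common vehicle frame.
# Motion model, noise levels, and measurement values are illustrative assumptions.

dt = 0.1                                   # sample time [s] (assumed)
F = np.array([[1, 0, dt, 0],               # state transition (constant velocity)
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],                # position-only measurement model
              [0, 1, 0, 0]], dtype=float)
Q = 0.01 * np.eye(4)                       # process noise covariance (assumed)
R_camera = np.diag([0.05, 0.05])           # stereo-camera measurement noise (assumed)
R_lidar = np.diag([0.02, 0.02])            # LiDAR measurement noise (assumed)

x = np.zeros(4)                            # initial state estimate
P = np.eye(4)                              # initial state covariance


def predict(x, P):
    """Propagate the state and covariance one time step ahead."""
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P


def update(x, P, z, R):
    """Correct the prediction with one sensor's position measurement."""
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P


# Centralized fusion: at each step, predict once and sequentially update with
# every available measurement (here one camera and one LiDAR detection).
for z_cam, z_lid in [(np.array([1.00, 0.50]), np.array([1.02, 0.48]))]:
    x, P = predict(x, P)
    x, P = update(x, P, z_cam, R_camera)
    x, P = update(x, P, z_lid, R_lidar)

print("fused position estimate:", x[:2])
```

Sequentially applying the update with each sensor's measurement and noise covariance is one common way to realize a centralized fusion scheme; the thesis' framework additionally handles measurement association and tracking, which are omitted here.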
Keywords
YOLO-v3, Object detection, Object position estimation, Lidar, Stereo vision, Sensor fusion
Disciplines
Aerospace Engineering | Engineering | Mechanical Engineering
License
This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.
Recommended Citation
Mehta, Kamalkumar Bharatkumar, "OBJECT CLASSIFICATION, DETECTION AND STATE ESTIMATION USING YOLO V3 DEEP NEURAL NETWORK AND SENSOR FUSION OF STEREO CAMERA AND LIDAR" (2021). Mechanical and Aerospace Engineering Theses. 791.
https://mavmatrix.uta.edu/mechaerospace_theses/791
Comments
Degree granted by The University of Texas at Arlington