Document Type
Honors Thesis
Abstract
Commercial document processing is expensive and prone to human error. Although many machine learning algorithms exist for object classification, their performance and feasibility vary widely with implementation and use case. This paper evaluates the performance of several classification algorithms for use in an automated electronic document classification system. The algorithms were used to classify about 1000 vectorized documents in an iterative environment, and were evaluated with performance measures such as precision, recall, and F-measure. Most algorithms achieved more than 95% accuracy. Logistic Regression was chosen as the final model because of its consistent overall performance, with precision above 95%.
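The measures named in the abstract (precision, recall, and F-measure) can be illustrated with a minimal sketch. This is not the thesis's code: the function name `per_class_scores` and the document labels below are hypothetical, chosen only to show how the three measures are computed per class from true and predicted labels.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class received a wrong document
            fn[truth] += 1   # true class missed one of its documents

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f_den = precision + recall
        # F-measure (F1): harmonic mean of precision and recall
        f1 = 2 * precision * recall / f_den if f_den else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels, for illustration only (not data from the thesis).
y_true = ["invoice", "invoice", "receipt", "contract", "receipt", "invoice"]
y_pred = ["invoice", "receipt", "receipt", "contract", "receipt", "invoice"]

scores = per_class_scores(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Reporting the measures per class, rather than a single accuracy figure, shows whether a model is consistently strong across document types, which is the kind of consistency the abstract cites in choosing the final model.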
Publication Date
5-1-2021
Language
English
License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Recommended Citation
Koirala, Sandesh, "CREATION OF CLASSIFICATION MODEL FOR USE IN AN AUTOMATED DOCUMENT CLASSIFICATION SYSTEM USING MACHINE LEARNING" (2021). 2021 Spring Honors Capstone Projects. 39.
https://mavmatrix.uta.edu/honors_spring2021/39