A Naive-Bayes Text Classifier using Laplace smoothing

Document Type

Article

Abstract

The text classifier was built on kaggle.com, a website that provides GPU resources for training models on large amounts of data. Using this website, I created a Naive Bayes text classifier in Python that classifies articles according to whether people agree with them. The method rests on the two ideas in its name. Bayes' theorem states that the probability of an event should be calculated in light of the evidence for that event: the probability of a category given the words in an article is proportional to the probability of those words given the category, multiplied by the prior probability of the category. The "naive" part is an assumption rather than a theorem: the pieces of evidence, here the individual words, are treated as independent of each other given the category. Following this logic, I counted the number of times a particular word appears in a document and how many times that word appears across all the documents of the dataset. Laplace smoothing adds a small constant to every count so that a word never seen in a category does not force that category's probability to zero. These statistics allow me to categorize articles by the words they contain. My classifier reaches an accuracy of 78%, which could be improved by adding more data and tuning the parameters of the model.
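The counting approach described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's code: the function names `train` and `predict`, the whitespace tokenizer, the smoothing constant `alpha=1.0`, and the four toy documents are all hypothetical and chosen only to show how word counts, class priors, and Laplace smoothing fit together.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha (Laplace) smoothing."""
    class_docs = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())

    # Vocabulary: every word seen anywhere in the training data.
    vocab = {w for counts in word_counts.values() for w in counts}

    model = {}
    for label, n_docs in class_docs.items():
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)  # smoothed denominator
        log_prior = math.log(n_docs / len(docs))
        log_lik = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
        model[label] = (log_prior, log_lik)
    return model, vocab


def predict(model, vocab, text):
    """Return the class with the highest log posterior; unknown words are skipped."""
    words = [w for w in text.lower().split() if w in vocab]
    scores = {
        label: log_prior + sum(log_lik[w] for w in words)
        for label, (log_prior, log_lik) in model.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy data, only for illustration.
docs = [
    "great article well argued",
    "well argued and convincing",
    "poorly argued misleading claims",
    "misleading and wrong",
]
labels = ["agree", "agree", "disagree", "disagree"]

model, vocab = train(docs, labels)
print(predict(model, vocab, "convincing and well argued"))
print(predict(model, vocab, "misleading claims"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together. With `alpha=1.0`, a word that never occurs in a class still receives a small nonzero probability, which is the effect of Laplace smoothing.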

Disciplines

Data Science

Publication Date

5-3-2023

Language

English

License

Creative Commons Attribution 4.0 International License

This document is currently not available here.
