Machine Learning Text Analyzer – Text Classification using Supervised and Un-Supervised Algorithms
Thesis
Text analysis is a branch of data mining that deals with text documents. This project addresses the classification of texts into categories. The volume of structured and unstructured data is rising rapidly, so the ability to classify it is important. Classification, however, begins with data collection, preprocessing, and feature extraction. Several techniques can be used for text classification; this project employs machine learning algorithms. With the advent of Natural Language Processing, the need for feature extraction and selection becomes apparent. This research shows how a computer can automatically classify texts into their categories. The emphasis is on English-language word documents.
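As an illustration of the stages named in this abstract (collection, preprocessing, feature extraction, and classification), the following is a minimal sketch rather than the implementation used in this thesis. It assumes a bag-of-words representation and a multinomial Naive Bayes classifier with add-one smoothing, which is one possible supervised technique among many. The toy corpus, the stopword list, and the function names are all hypothetical and chosen only for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for the "collection" stage.
TRAIN = [
    ("the team won the match with a late goal", "sports"),
    ("the striker scored twice in the final game", "sports"),
    ("the new phone ships with a faster processor", "tech"),
    ("the software update improves battery life on the laptop", "tech"),
]

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "with", "in", "on", "of", "and"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def extract_features(text):
    """Bag-of-words feature extraction: map each token to its count."""
    return Counter(preprocess(text))


def train(examples):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        feats = extract_features(text)
        class_docs[label] += 1
        word_counts[label].update(feats)
        vocab.update(feats)
    return class_docs, word_counts, vocab


def classify(text, model):
    """Return the class with the highest log-posterior under Naive Bayes."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    feats = extract_features(text)
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior of the class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word, count in feats.items():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed word likelihood.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += count * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(classify("the goal decided the game", model))
```

A real study would replace the toy corpus with the collected documents and would likely compare several classifiers and feature-selection methods; the sketch is only meant to make the order of the stages concrete.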