Malaria Prediction using Bayesian and other Machine Learning Techniques
Thesis
The main purpose of data mining is to extract valuable information from available data. With the enormous amount of data stored in files, databases, and repositories in the healthcare sector, it is increasingly important, if not necessary, to develop powerful means of analysing and interpreting such data in order to extract knowledge that can support decision-making. Classification is a data mining technique that assigns an object to a class based on its similarity to previous examples of other objects in the dataset. This thesis uses four classification algorithms, Bayesian Network, Decision Tree (J48), ZeroR and OneR, to build classification models on a malaria dataset. It compares the models, measures their performance using the WEKA API library imported into JDK 8, and applies an ensemble method (boosting) to improve the performance of the classification algorithms. The results show that Decision Tree (J48) yields the highest accuracy, 88.4% both before and after boosting, followed by Naïve Bayes with 79.9% before and 87.0% after boosting, then OneR with 79.8% before and 87.6% after boosting, while ZeroR has the lowest accuracy, 60.9% both before and after boosting. The research suggests that Decision Tree (J48) is the best of these algorithms to use on the malaria dataset in our hospitals.
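To make the two baseline learners named in the abstract concrete, the following is a minimal plain-Java sketch of the ideas behind ZeroR (always predict the most frequent class) and OneR (build a rule from the single most predictive attribute). It is not the WEKA implementation used in the thesis. The class name `BaselineSketch`, its method names, and the five-row toy data are hypothetical and chosen only for illustration; they are not taken from the malaria dataset.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not the WEKA API.
final class BaselineSketch {

    // ZeroR: ignore all attributes and return the most frequent class label.
    static String zeroR(String[] labels) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = -1;
        for (String label : labels) {
            int c = counts.merge(label, 1, Integer::sum);
            if (c > bestCount) {
                best = label;
                bestCount = c;
            }
        }
        return best;
    }

    // Number of training rows a one-attribute rule classifies correctly:
    // each value of the attribute predicts the majority class seen with it.
    static int oneRCorrect(String[][] rows, String[] labels, int attr) {
        Map<String, Map<String, Integer>> table = new HashMap<>();
        for (int i = 0; i < rows.length; i++) {
            table.computeIfAbsent(rows[i][attr], k -> new HashMap<>())
                 .merge(labels[i], 1, Integer::sum);
        }
        int correct = 0;
        for (Map<String, Integer> classCounts : table.values()) {
            int max = 0;
            for (int c : classCounts.values()) {
                max = Math.max(max, c);
            }
            correct += max;
        }
        return correct;
    }

    // OneR: choose the attribute whose rule makes the fewest training errors.
    static int oneRBestAttribute(String[][] rows, String[] labels) {
        int best = 0;
        int bestCorrect = -1;
        for (int a = 0; a < rows[0].length; a++) {
            int correct = oneRCorrect(rows, labels, a);
            if (correct > bestCorrect) {
                best = a;
                bestCorrect = correct;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up toy data: columns are {fever, headache}.
        String[][] rows = {
            {"yes", "yes"}, {"yes", "no"}, {"yes", "yes"},
            {"no", "yes"}, {"no", "no"}
        };
        String[] labels = {"positive", "positive", "positive", "negative", "negative"};

        System.out.println("ZeroR predicts: " + zeroR(labels));
        System.out.println("OneR best attribute index: " + oneRBestAttribute(rows, labels));
    }
}
```

On this toy data ZeroR is right on 3 of 5 rows, while the OneR rule on the first attribute is right on all 5. This illustrates why ZeroR serves mainly as a floor against which the other classifiers are compared.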