
accuracy. The main objective of our project is to
improve the efficiency of heart disease prediction.
2. Related Work
Heart disease is currently the leading cause of
death. Umair Shafique et al. [1] applied data
mining techniques, namely decision tree, Naïve
Bayes, and neural network algorithms, and reported
accuracies of 82% for Naïve Bayes and 78% for the
decision tree. The authors used the WEKA
(https://sourceforge.net/projects/weka/) machine
learning software in their work. Sabarinathan
Vachiravel et al. [2] proposed a machine learning
approach to predict heart disease and achieved 85%
accuracy using a decision tree.
Vikas Chaurasia et al. [3] used a dataset
downloaded from the UCI Machine Learning
Repository. It has 14 attributes, of which they
used only 11 to predict heart disease with the
Naïve Bayes and decision tree machine learning
algorithms. Using the WEKA tool, they achieved 82%
accuracy for Naïve Bayes and 84% for the decision
tree. N. Komal Kumar, G. Sarika Sindhu et al. [4]
proposed machine learning algorithms such as
Random Forest, Logistic Regression, Support
Vector Machine (SVM), and K-Nearest Neighbors
(KNN) for heart disease prediction. The highest
accuracy they achieved was 85% using Random
Forest, followed by 77% for SVM and 74% for
Logistic Regression. The lowest accuracy, 68%,
was obtained with KNN. The dataset they used was
unbalanced and therefore called for sampling
techniques, but they applied the machine learning
algorithms directly, without filtering the data.
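As an illustration of the kind of sampling step an unbalanced dataset calls for, the following minimal sketch performs random oversampling of the minority class in plain Python. It is not the procedure of [4] or of any other cited work; the helper name `random_oversample`, the toy records, and the fixed seed are assumptions made for this example only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class that can be duplicated.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up records [age, resting blood pressure]: 6 healthy (0), 2 diseased (1).
rows = [[63, 145], [37, 130], [41, 130], [56, 120],
        [57, 120], [44, 140], [67, 160], [62, 150]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling is applied to the training split only, so that duplicated records do not leak into the test set.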
Malkari Bhargav et al. [5] proposed identifying
heart disease using different ML techniques. They
collected the dataset from the UCI ML repository.
The dataset has a total of 14 parameters, such as
age, blood pressure, and cholesterol. They achieved
the highest accuracy of 96% using ANN, followed by
88% using logistic regression, 83% using Random
Forest, 83% using Decision Tree, and 70% using
SVM; the lowest accuracy was 68% using KNN.
Gayatri Ramamoorthy et al. [6] made
use of ML models to forecast heart disease. The
authors obtained the highest accuracy of 83% with
KNN, 80% with Naïve Bayes, and the lowest accuracy
of 65% with SVM. Apurb Rajdhan et al. [7] used
ML algorithms such as decision tree, logistic
regression, random forest, and naïve Bayes to
analyze cardiovascular disease and achieved
accuracies of 81%, 85%, 90%, and 85%,
respectively. Hana H. Alalawi et al. [8] used deep
learning and machine learning algorithms to
diagnose heart disease using a combination of two
datasets: one collected from Kaggle and the
Cleveland heart dataset. They achieved the maximum
accuracy of 92% using Random Forest, compared with
83% for Naïve Bayes, 77% for ANN, 75% for logistic
regression, 72% for SVM, and 71% for KNN.
J. Maiga et al. [9] developed a model for
predicting heart disease that uses various
combinations of features. The classification
algorithms used were KNN, naïve Bayes, and random
forest. The authors achieved the highest accuracy
of 73% using random forest. They did not achieve
good accuracies because they did not perform
feature scaling and normalization on the data.
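To make the preprocessing step mentioned above concrete, the sketch below shows two common forms of feature scaling, min-max normalization and z-score standardization, using only the Python standard library. The function names and the cholesterol readings are assumptions made for illustration and are not taken from [9].

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up cholesterol readings (mg/dl), for illustration only.
cholesterol = [200, 240, 180, 300, 220]

scaled = min_max_scale(cholesterol)
z_scores = standardize(cholesterol)
print(scaled)
print(z_scores)
```

Distance-based methods such as KNN and SVM are sensitive to the scale of their inputs, which is one reason this step can matter for the algorithms discussed here. The scaling parameters should be computed on the training data and then reused on the test data.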
A. Lakshmanrao et al. [10] used several data
mining and gradient boosting algorithms in their
research for diagnosing heart disease. The authors
applied several sampling techniques to handle the
unbalanced dataset. The dataset, called “Framingham
heart disease”, was collected from Kaggle and has
15 features and a total of 4220 patient records.
They used boosting algorithms such as AdaBoost and
Gradient Boosting, for which they achieved
accuracies of 78% and 88%, respectively. Naïve
Bayes gave the lowest accuracy, just 61%, and
logistic regression gave 66%. Ashok Kumar Dwivedi
et al. [11] evaluated the performance of machine
learning techniques for predicting heart disease
using ten-fold cross-validation. Naïve Bayes, KNN,
ANN, SVM, and Logistic Regression achieved
accuracies of 83%, 80%, 84%, 82%, and 85%,
respectively.
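The ten-fold cross-validation protocol mentioned above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the evaluation code of [11]: the helper names, the majority-class stand-in model, and the toy data are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_val_accuracy(fit, predict, X, y, k=10):
    """Average test accuracy over k folds for a user-supplied model."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Stand-in model: always predicts the majority class of the training fold.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model


X = [[i] for i in range(50)]   # made-up single-feature records
y = [0] * 30 + [1] * 20        # made-up labels
acc = cross_val_accuracy(fit_majority, predict_majority, X, y)
print(round(acc, 2))
```

Each record is used for testing exactly once and for training in the other nine folds, so the averaged accuracy depends less on one particular train/test split.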
Muhammad Saqlain et al. [12] identified heart
failure using unstructured data of heart patients.
They used several ML algorithms and reported
accuracies of 80% for Logistic Regression, 83% for
SVM, 86% for Random Forest, 86% for Decision Tree,
and 84% for a neural network. Hossam Meshref et al.
[13] compared several ML algorithms, such as SVM,
Naïve Bayes, and MLP, and also selected specific
features from the datasets, obtaining different
accuracies for different feature subsets. With all
14 features of the datasets, the highest accuracy
was 81%, using Naïve Bayes; with only specific
features selected, the highest accuracy was
obtained using SVM. From their research, they
concluded that feature selection is the most
important step for improving the accuracy of
WSEAS TRANSACTIONS on BIOLOGY and BIOMEDICINE
DOI: 10.37394/23208.2022.19.1
Nikhil Bora, Sreedevi Gutta, Ahmad Hadaegh