An Evaluation of Artificial Neural Networks and Random Forests for Heart Disease Prediction

Wan Aezwani Wan Abu Bakar, Nur Laila Najwa B. Josdi, Mustafa B. Man, Yaya Sudarya Triana


Heart disease is a serious problem in many countries worldwide. In Malaysia, it has been a major killer since 1980. Many health conditions are closely related to heart disease. However, the large amount of data that medical centers collect each year is not well mined for connections that could aid in the prognosis of heart disease. Therefore, the purpose of this study is to propose a machine-learning-based predictive model of heart disease for prognosis, to help individuals with symptoms seek early advice and treatment. Following the Knowledge Discovery in Databases (KDD) methodology, which comprises data selection, data pre-processing, data transformation, data mining, and interpretation or evaluation of the acquired knowledge, this study tested a dataset taken from the UCI Machine Learning Repository. Two classification algorithms were used: Artificial Neural Network and Random Forest. They were selected for their suitability in the medical field, particularly for prognosis and diagnosis, and because related works by previous authors report high and reliable accuracy for them. Two approaches were used to determine the maximum accuracy achieved by each algorithm: dataset splitting and K-Fold Cross-Validation. On the test set, which was subdivided into several subsets, Artificial Neural Network and Random Forest produced stable accuracies of 67.9% and 64.6%, respectively. The accuracy of the Artificial Neural Network was the more stable of the two across both the training and testing sets. In conclusion, Artificial Neural Network was selected as the algorithm best suited to the Heart Disease Prediction Model, because its test-set accuracy is slightly better than that of Random Forest.


Keywords: artificial neural network, accuracy, data mining, heart disease prediction, random forest.
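
The two evaluation schemes named in the abstract, dataset splitting and K-Fold Cross-Validation, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 30% test fraction, the value k = 5, and the majority-class baseline that stands in for the Artificial Neural Network and Random Forest classifiers are all assumptions made for illustration, and the example uses only the Python standard library.

```python
import random
from collections import Counter


def holdout_split(n_samples, test_fraction=0.3, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def majority_baseline(train_labels):
    """Hypothetical stand-in classifier: always predicts the most common label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common


def accuracy(predict, features, labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(x) == y for x, y in zip(features, labels))
    return correct / len(labels)


def cross_validated_accuracy(features, labels, k=5, seed=0):
    """Average test accuracy over k folds, refitting the model on each fold."""
    scores = []
    for train, test in k_fold_indices(len(labels), k, seed):
        model = majority_baseline([labels[i] for i in train])
        scores.append(
            accuracy(model, [features[i] for i in test], [labels[i] for i in test])
        )
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Synthetic placeholder data, not the UCI heart disease dataset.
    features = [[i] for i in range(20)]
    labels = [1] * 12 + [0] * 8

    train, test = holdout_split(len(labels))
    model = majority_baseline([labels[i] for i in train])
    print("hold-out accuracy:",
          accuracy(model, [features[i] for i in test], [labels[i] for i in test]))
    print("5-fold accuracy:", cross_validated_accuracy(features, labels))
```

Swapping `majority_baseline` for a real classifier leaves the splitting and averaging logic unchanged, which is why the same procedure can be used to compare two algorithms on equal terms.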






