An Ensemble Machine Learning Strategy for Accurate and Timely Prediction of Heart Disease
DOI:
https://doi.org/10.59796/jcst.V16N1.2026.166
Keywords:
heart disease, classification, machine learning, ensemble, bagging, boosting, stacking
Abstract
Heart disease is a complex disorder that is becoming increasingly common, and early detection is essential for choosing the best treatment plan for each patient. The main goal of this study is to improve heart disease prediction by combining machine learning (ML) models with ensemble learning techniques. The dataset, comprising 319,755 records from the Centers for Disease Control and Prevention, was downloaded from Kaggle. Class imbalance was addressed with the SMOTE-ENN hybrid resampling approach. Preprocessing consisted of standardizing the numerical variables to put them on a consistent scale and creating dummy variables for the categorical ones. Four ML models (Random Forest, Logistic Regression, Support Vector Machine (SVM), and Extra Trees) were applied to the processed dataset, both with and without bagging, boosting, and stacking ensemble methods. Stacking, which combined SVM and Logistic Regression, outperformed the baseline models and the other ensemble techniques, achieving the highest recall of 0.837731. These results underline the importance of data balancing and ensemble learning for accurate prediction on medical datasets, which are typically large. The findings highlight how ML can enhance early diagnosis and intervention in the treatment of cardiac disease.
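The workflow summarized in the abstract (standardization, dummy variables, a stacking ensemble of SVM and Logistic Regression, evaluation by recall) can be sketched roughly as follows with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic rather than the CDC records, the column layout is hypothetical, and the abstract does not say how the two models were arranged, so using both as base learners with a Logistic Regression meta-learner is an assumption. The sketch also swaps SMOTE-ENN for `class_weight="balanced"` to stay within scikit-learn; with the imbalanced-learn package installed, its `SMOTEENN` resampler could instead be applied to the training split only.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the CDC dataset): 4 numeric columns and an
# imbalanced binary target, plus one hypothetical 3-level categorical column.
X_num, y = make_classification(
    n_samples=600, n_features=4, n_informative=3, n_redundant=1,
    weights=[0.8, 0.2], random_state=0,
)
rng = np.random.default_rng(0)
X_cat = rng.integers(0, 3, size=(600, 1))
X = np.hstack([X_num, X_cat])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize numeric columns; expand the categorical column into dummy variables.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), [0, 1, 2, 3]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), [4]),
])

# Assumed arrangement: SVM and Logistic Regression as base learners, with a
# Logistic Regression meta-learner trained on their cross-validated outputs.
# class_weight="balanced" is a stand-in for SMOTE-ENN, not the paper's method.
stack = StackingClassifier(
    estimators=[
        ("svm", SVC(class_weight="balanced")),
        ("lr", LogisticRegression(class_weight="balanced", max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

model = make_pipeline(preprocess, stack)
model.fit(X_train, y_train)

# Recall = TP / (TP + FN): the share of true positive cases the model detects.
y_pred = model.predict(X_test)
recall = recall_score(y_test, y_pred)
print(f"Recall on the held-out split: {recall:.3f}")
```

Recall is the natural headline metric in a screening setting like this one because a missed case (false negative) is usually costlier than a false alarm. The value printed here depends entirely on the synthetic data and says nothing about the 0.837731 reported in the study.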
License
Copyright (c) 2025 Journal of Current Science and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.



