Identification of factors in the survival rate of heart failure patients using machine learning models and principal component analysis


  • Thanawan Panyamit, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Phattharamon Sukvivatn, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Paphada Chanma, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Yejin Kim, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Premmika Premratanachai, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Suejit Pechprasarn, College of Biomedical Engineering, Rangsit University, Patumthani 12000, Thailand


Keywords: heart failure classification, machine learning, mortality rate prediction, smart healthcare


Heart failure (HF) and congestive heart failure (CHF) have recently been classified as a growing, widespread epidemic that significantly impacts morbidity and mortality worldwide, especially in older age groups. This study used a publicly available clinical dataset of 299 HF patients with 12 variables potentially contributing to their mortality: age, anemia, creatinine phosphokinase, diabetes, ejection fraction, high blood pressure, platelets, serum creatinine, serum sodium, sex, smoking, and follow-up time. Several previous studies used this dataset to identify critical factors influencing patient mortality. Here, we curate the data to ensure it is unbiased, then apply principal component analysis and machine learning models to identify the crucial variables contributing to patient mortality. We compare the classification accuracy of several machine learning models: tree, linear discriminant, quadratic discriminant, logistic regression, naïve Bayes, support vector machine, nearest-neighbor, ensemble, and kernel models. The ensemble bagged tree model achieved the highest cross-validation classification accuracy, 96.4%, and required only three variables: platelets, creatinine phosphokinase, and follow-up period.
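To illustrate how a cross-validated accuracy for a bagged tree ensemble can be computed, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes depth-1 trees (decision stumps) as the base learner, 5-fold cross-validation, and a small synthetic three-feature dataset generated by the hypothetical helper `make_synthetic`; none of the function names, parameters, or data come from the study, and the principal component analysis step is not covered.

```python
import random
from collections import Counter


def stump_errors(X, y, j, t, left_label):
    """Count training errors for the rule: predict left_label if x[j] <= t."""
    return sum(
        (left_label if row[j] <= t else 1 - left_label) != label
        for row, label in zip(X, y)
    )


def fit_stump(X, y):
    """Exhaustively pick the (feature, threshold, left_label) with fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left_label in (0, 1):
                errors = stump_errors(X, y, j, t, left_label)
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_label = stump
    return left_label if row[j] <= t else 1 - left_label


def fit_bagged(X, y, n_estimators, rng):
    """Bagging: train each stump on a bootstrap sample drawn with replacement."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, row):
    """Majority vote; an odd n_estimators avoids ties for binary labels."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k, n_estimators, seed=0):
    """k-fold cross-validation: each sample is predicted once by a model
    that never saw it during training."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        models = fit_bagged(
            [X[i] for i in train], [y[i] for i in train], n_estimators, rng
        )
        correct += sum(predict_bagged(models, X[i]) == y[i] for i in fold)
    return correct / len(X)


def make_synthetic(n, seed=0):
    """Hypothetical data: feature 0 is informative, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
        y.append(label)
    return X, y


X, y = make_synthetic(100)
accuracy = cross_val_accuracy(X, y, k=5, n_estimators=11)
print(f"5-fold CV accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and has no bearing on the 96.4% reported above. In practice, a library implementation with full decision trees would likely be preferable to this hand-rolled sketch.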






How to Cite

Panyamit, T., Sukvivatn, P., Chanma, P., Kim, Y., Premratanachai, P., & Pechprasarn, S. (2023). Identification of factors in the survival rate of heart failure patients using machine learning models and principal component analysis. Journal of Current Science and Technology, 12(2), 336–348.



Research Article