Identification of factors in the survival rate of heart failure patients using machine learning models and principal component analysis


  • Thanawan Panyamit, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Phattharamon Sukvivatn, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Paphada Chanma, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Yejin Kim, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Premmika Premratanachai, Satriwitthaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Suejit Pechprasarn, College of Biomedical Engineering, Rangsit University, Patumthani 12000, Thailand


heart failure classification, machine learning, mortality rate prediction, smart healthcare


Heart failure (HF) and congestive heart failure (CHF) have recently been classified as a growing worldwide epidemic with a significant impact on morbidity and mortality, especially in older age groups. This study used a publicly available clinical dataset of 299 HF patients with 12 variables potentially contributing to mortality: age, anemia, creatinine phosphokinase, diabetes, ejection fraction, high blood pressure, platelets, serum creatinine, serum sodium, sex, smoking, and follow-up time. Several previous studies have used this dataset to identify critical factors influencing patient mortality. Here, we curate the data to ensure it is unbiased, then apply principal component analysis and machine learning models to identify the crucial variables contributing to patient mortality. We investigate and compare the classification accuracy of different machine learning models, including decision tree, linear discriminant, quadratic discriminant, logistic regression, naïve Bayes, support vector machine, nearest-neighbor, ensemble, and kernel models. We found that the ensemble bagged tree model had the highest cross-validation classification accuracy, 96.4%, and required only three variables: platelets, creatinine phosphokinase, and follow-up time.
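
To make the idea of a cross-validated bagged-tree classifier concrete, the following minimal sketch may help. It is not the authors' implementation. It assumes synthetic two-feature data rather than the clinical dataset, uses one-split decision stumps in place of full decision trees, and every function name (fit_stump, fit_bagged, cross_val_accuracy) is hypothetical. Each stump is trained on a bootstrap resample, predictions are combined by majority vote, and accuracy is measured only on held-out folds.

```python
import random

def fit_stump(X, y):
    """Find the single-feature threshold rule with the fewest training errors."""
    best = None  # (errors, feature index, threshold, label predicted at or below threshold)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for below in (0, 1):
                errors = sum((below if row[j] <= t else 1 - below) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, below)
    return best[1:]

def predict_stump(stump, row):
    j, t, below = stump
    return below if row[j] <= t else 1 - below

def fit_bagged(X, y, n_models, rng):
    """Train each stump on a bootstrap resample (rows drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagged(models, row):
    """Majority vote over the ensemble (binary labels 0/1)."""
    votes = sum(predict_stump(m, row) for m in models)
    return 1 if 2 * votes > len(models) else 0

def cross_val_accuracy(X, y, k, n_models, seed=0):
    """k-fold cross-validation: every sample is scored by a model that never saw it."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        models = fit_bagged([X[i] for i in train], [y[i] for i in train],
                            n_models, rng)
        correct += sum(predict_bagged(models, X[i]) == y[i] for i in fold)
    return correct / len(X)

# Synthetic stand-in data (assumption): the label depends only on feature 0.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(120)]
y = [1 if row[0] > 0.5 else 0 for row in X]

accuracy = cross_val_accuracy(X, y, k=5, n_models=15)
print(f"5-fold cross-validation accuracy: {accuracy:.3f}")
```

In this toy setting the ensemble should recover the informative feature and ignore the noise feature, which loosely mirrors how a model can reach high accuracy with only a few variables. A real analysis would use full decision trees and the actual patient records.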


Research Article