A Machine Learning Approach to Predicting Survival of Clinical AIDS Patients via Feature Selection Algorithms

Authors

  • Suejit Pechprasarn, College of Biomedical Engineering, Rangsit University, Pathum Thani 12000, Thailand; Center of Excellence in AI and Supercomputing, Rangsit University, Pathum Thani 12000, Thailand. ORCID: https://orcid.org/0000-0001-9105-8627
  • Punn Santichanyaphon, Satriwithaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Rawisara Triyaprasertporn, Satriwithaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand
  • Achiraya Asawarachan, Satriwithaya School, Wat Bowon Niwet, Phra Nakhon, Bangkok 10200, Thailand

DOI:

https://doi.org/10.59796/jcst.V16N1.2026.158

Keywords:

AIDS, algorithm, artificial intelligence, classification learner, feature selection, HIV, machine learning

Abstract

Many people live with incurable illnesses such as acquired immunodeficiency syndrome (AIDS). This study aims to assist AIDS patients and healthcare professionals by building machine-learning models in MATLAB R2024b that use feature selection to reduce the number of attributes required by the operational model. Using fewer attributes could lower medical costs and the time needed to examine patients, help personalize treatment, and optimize healthcare management. The dataset, AIDS Clinical Trials Group Study 175, was obtained from the open-source platform Kaggle and consists of 23 predictors, including time, treatment indicator, age, weight, hemophilia, homosexual activity, history of IV drug use, Karnofsky score, history of non-ZDV antiretroviral therapy, ZDV use in the 30 days before the dataset collection period, ZDV history, antiretroviral therapy history, race, gender, symptom indicator, treatment indicator, indicator of off-treatment before 96 ± 5 weeks, CD4 count, and CD8 count. The data were randomly split into training and testing sets in an 80:20 ratio, and 34 machine learning models were trained to identify the best-performing model. The feature selection methods, namely Minimum Redundancy Maximum Relevance (MRMR), Chi-square (χ²), ANOVA, and Kruskal-Wallis, were used to highlight the importance of each clinical feature. In the training stage, the Boosted Tree (Tree) achieved the highest accuracy, 87.86%. Each model was then evaluated on the test dataset, and the results were compared with those from the training stage. The models also underwent feature-selection analysis to determine the significance of each predictor and the minimum number of predictors required for efficient performance. Finally, we conclude that the best-performing model is the Linear SVM (SVM), which reaches 85.0% accuracy with only 5 or 6 predictors: (1) time, (2) cd420, (3) karnof, (4) cd40, (5) z30, and (6) str2.
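
The two generic steps named in the abstract, a random 80:20 split and a univariate feature ranking, can be illustrated with a minimal sketch. It is not the authors' implementation: the study used MATLAB R2024b, whereas this is plain Python, and the data are synthetic rather than ACTG 175. The function names and the feature names `informative` and `noise` are hypothetical, and ANOVA is shown only because it is one of the ranking methods listed.

```python
import random
from statistics import mean


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle sample indices and hold out `test_fraction` of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]

    return pick(train_idx), pick(test_idx)


def anova_f(values, labels):
    """One-way ANOVA F statistic of one feature across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    k, n = len(groups), len(values)
    if k < 2:
        raise ValueError("ANOVA needs at least two classes")
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(rows, labels, names):
    """Return (feature name, F score) pairs, highest score first."""
    scores = {
        name: anova_f([row[j] for row in rows], labels)
        for j, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic stand-in data (assumption): one feature that shifts with the
# class label and one pure-noise feature.
rng = random.Random(42)
names = ["informative", "noise"]
labels = [i % 2 for i in range(200)]
rows = [[2.0 * y + rng.gauss(0, 1), rng.gauss(0, 1)] for y in labels]

(train_X, train_y), (test_X, test_y) = train_test_split(rows, labels, 0.2, seed=1)

# Rank features on the training portion only, so the held-out test set
# does not influence which predictors are kept.
ranking = rank_features(train_X, train_y, names)
for name, score in ranking:
    print(f"{name}: F = {score:.2f}")
```

In a workflow like the one described, the top-ranked predictors would then be fed to a classifier and the reduced model compared against the full one on the held-out 20%.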

Published

2025-12-25

How to Cite

Pechprasarn, S., Santichanyaphon, P., Triyaprasertporn, R., & Asawarachan, A. (2025). A Machine Learning Approach to Predicting Survival of Clinical AIDS Patients via Feature Selection Algorithms. Journal of Current Science and Technology, 16(1), 158. https://doi.org/10.59796/jcst.V16N1.2026.158