A Machine Learning Approach to Predicting Survival of Clinical AIDS Patients via Feature Selection Algorithms
DOI:
https://doi.org/10.59796/jcst.V16N1.2026.158
Keywords:
AIDS, algorithm, artificial intelligence, classification learner, feature selection, HIV, machine learning
Abstract
Many people live with incurable illnesses such as acquired immunodeficiency syndrome (AIDS). One objective of this study is to assist AIDS patients and healthcare professionals by building machine-learning models in MATLAB R2024b that use feature selection to reduce the number of attributes in the operational model. Fewer attributes could lower medical costs and the time required to examine patients, help personalize treatment, and optimize healthcare management. The dataset analyzed, AIDS Clinical Trials Group Study 175, was obtained from the open-source platform Kaggle and consists of 23 predictors, including time, treatment indicator, age, weight, hemophilia, homosexual activity, history of IV drug use, Karnofsky score, history of non-ZDV antiretroviral therapy, ZDV use in the 30 days before the dataset collection period, ZDV history, antiretroviral therapy history, race, gender, symptom indicator, treatment indicator, indicator of off-treatment before 96 ± 5 weeks, CD4 count, and CD8 count. The data were randomly split into training and testing sets in an 80:20 ratio to train 34 machine learning models and identify the best-performing one. The feature selection methods were Minimum Redundancy Maximum Relevance (MRMR), Chi-square (χ²), ANOVA, and Kruskal-Wallis, which were used to rank the importance of each clinical feature. At this stage, the Boosted Tree (Tree) achieved the highest accuracy of 87.86%. Each model was then evaluated on the test dataset, and the results were compared with those from the previous procedure. The models also underwent feature-selection analysis to determine the significance of each predictor and the minimum number of predictors required to function efficiently. Finally, we conclude that the best-performing model is the Linear SVM (SVM), which reached 85.0% accuracy with only 5 or 6 predictors: (1) time, (2) cd420, (3) karnof, (4) cd40, (5) z30, and (6) str2.
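The two generic steps named in the abstract, a random 80:20 split and a univariate ranking of predictors, can be illustrated with a minimal sketch. This is not the authors' pipeline: the study used MATLAB R2024b, whereas the sketch below uses plain Python as a stand-in. The rows are synthetic toy values rather than records from AIDS Clinical Trials Group Study 175, the column name `label` and the helper names are hypothetical, and the score is the raw one-way ANOVA F statistic (other tools may report a transformed p-value instead).

```python
import random
from statistics import mean


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80:20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def anova_f(values, labels):
    """One-way ANOVA F statistic of one numeric feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(rows, feature_names, label_name):
    """Return (feature, F score) pairs sorted from most to least discriminative."""
    labels = [r[label_name] for r in rows]
    scores = {f: anova_f([r[f] for r in rows], labels) for f in feature_names}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic toy rows (NOT study data): "cd420" separates the two classes,
# "age" is spread almost identically in both. "label" is a hypothetical name.
toy = [
    {"cd420": 150 + 10 * i, "age": 30 + (i * 7) % 11, "label": 1}
    for i in range(10)
] + [
    {"cd420": 400 + 10 * i, "age": 30 + (i * 5) % 11, "label": 0}
    for i in range(10)
]

train, test = split_80_20(toy, seed=42)
ranking = rank_features(train, ["cd420", "age"], "label")
for name, score in ranking:
    print(f"{name}: F = {score:.2f}")
```

Ranking on the training split only, as done here, keeps the held-out 20% untouched until the final evaluation. A model would then be refit on the top-ranked predictors and compared against the full-feature model.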
Categories
- Computing (Computer Science; Computer Engineering) > Artificial Intelligence (AI)
- Computing (Computer Science; Computer Engineering) > Bioinformatics
- Computing (Computer Science; Computer Engineering) > Data Science and Analytics
- Computing (Computer Science; Computer Engineering) > Machine Learning and Intelligent Systems
License
Copyright (c) 2025 Journal of Current Science and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.



