Optimizing Chronic Kidney Disease Prediction: A Machine Learning Approach with Minimal Diagnostic Predictors
DOI:
https://doi.org/10.59796/jcst.V15N1.2025.76
Keywords:
chronic kidney disease classification, chronic kidney disease, machine learning, feature selection methods, artificial intelligence
Abstract
Chronic kidney disease (CKD) is a major public health issue that requires accurate diagnostic methods for effective management. This study trained machine learning models on an open-source clinical dataset of 200 patients from Enam Medical College, comprising 28 clinical features and obtained from the UCI Machine Learning Repository. After preprocessing to balance the dataset for objectivity, the data were split into training and testing sets in an 80:20 ratio. In total, 22 machine learning models were trained, including Naïve Bayes, decision trees, support vector machines (SVM), logistic regression, ensemble methods, kernel models, and neural networks. The models were evaluated using several metrics, namely accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic (ROC) curve, computed through 5-fold cross-validation to assess performance and to check for overfitting or underfitting. The best-performing model was Kernel Naïve Bayes, which achieved 96.55% accuracy, 95% precision, 98.28% recall, and a 96.61% F1-score on the training dataset. On the test dataset, its performance dropped slightly but remained robust, with 92.86% accuracy, 87.50% precision, 100% recall, and a 93.33% F1-score. Feature selection techniques, including minimum-redundancy-maximum-relevance, Chi2, ANOVA, and Kruskal-Wallis tests, were then used to determine the most significant predictors. Only four features (packed cell value, stages of glomerular filtration rate, specific gravity of urine, and albumin content in urine) were needed to maintain similar model performance. This systematic approach not only highlighted critical clinical features but also reduced model complexity, which could benefit broader medical applications such as lung cancer screening by reducing screening time, resources, and medical costs.
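The four threshold-based metrics reported above all follow from a confusion matrix, and a short Python sketch can make the relationship concrete. The abstract does not report confusion-matrix counts, so the counts below are hypothetical: they are simply one set of integers that happens to reproduce the reported percentages, and the function name `classification_metrics` is likewise an illustrative assumption, not part of the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 as percentages (2 decimals)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    values = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
    return {name: round(100 * value, 2) for name, value in values.items()}


# ASSUMPTION: hypothetical counts, not taken from the paper. They were chosen
# only because they reproduce the percentages quoted in the abstract.
train_counts = dict(tp=57, fp=3, fn=1, tn=55)   # 116 samples
test_counts = dict(tp=14, fp=2, fn=0, tn=12)    # 28 samples

train_metrics = classification_metrics(**train_counts)
test_metrics = classification_metrics(**test_counts)

print("train:", train_metrics)
print("test: ", test_metrics)
```

Under these assumed counts, the 100% test recall corresponds to zero false negatives, while the lower test precision corresponds to two false positives. The actual counts behind the reported figures may differ.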
License
Copyright (c) 2024 Journal of Current Science and Technology
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.