Predicting Thyroid Cancer Recurrence Using Machine Learning: An Artificial Intelligence Approach to Clinical Oncology

  Joy Aifuobhokhan
  Ahmad Khalid Hussain
  Chijioke Cyriacus Ekechi
  Aisha Olasunbo Olanrewaju
  Emmanuel Afuadajo
  Deborah Adetola Bowale
  Oluwadare Marvellous Inioluwa

Abstract

Background of study: Differentiated thyroid cancer (DTC) accounts for most thyroid malignancies and has favorable survival outcomes, yet up to 30% of patients experience recurrence, placing strain on follow-up systems in resource-limited settings. Conventional staging tools offer limited predictive precision. With increasing interest in machine learning (ML) for precision oncology, there is a need for interpretable, deployable models suitable for low-resource environments.
Aims and scope of paper: To develop and validate an interpretable machine learning model for predicting thyroid cancer recurrence and assess its feasibility for deployment in constrained clinical settings, including African oncology contexts.
Methods: A retrospective dataset of 383 DTC patients with at least 10 years of follow-up was sourced from the UCI Machine Learning Repository. Thirteen demographic, clinical, and treatment-related predictors were included. Data preprocessing involved categorical encoding, feature scaling, and class balancing using SMOTE. Logistic Regression, Random Forest, K-Nearest Neighbors, and Extreme Gradient Boosting (XGBoost) were trained with hyperparameter tuning via grid search and cross-validation. Performance was evaluated using accuracy, precision, recall, F1 score, and AUC-ROC.
Results: XGBoost achieved the best performance, with 97% accuracy, 95% recall, 94% precision, and an AUC-ROC of 0.93. The most influential predictors were age, smoking status, T and M staging, ATA risk category, and adenopathy. The final model was deployed as a browser-based decision support tool to enable real-time recurrence risk estimation.
Conclusion: This study presents a high-performing and interpretable ML model for predicting DTC recurrence, demonstrating feasibility for use in low-resource oncology settings. External validation with African clinical datasets and integration into electronic health systems are recommended to enhance equity and clinical uptake.
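
The evaluation metrics named in the abstract (accuracy, precision, recall, F1 score, and AUC-ROC) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names `classification_metrics` and `auc_roc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers bear no relation to the study's dataset or results.

```python
from typing import Dict, List


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for a binary task (1 = recurrence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # recurrences caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-recurrences cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed recurrences

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc_roc(y_true: List[int], scores: List[float]) -> float:
    """AUC-ROC as the probability that a randomly chosen positive case is
    scored above a randomly chosen negative case (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted risk scores, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
print(metrics)
print("AUC-ROC:", auc_roc(y_true, scores))
```

Accuracy, precision, recall, and F1 depend on the chosen decision threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the two kinds of figure can diverge for the same model.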

How to Cite
Aifuobhokhan, J., Hussain, A. K., Ekechi, C. C., Olanrewaju, A. O., Afuadajo, E., Bowale, D. A., & Inioluwa, O. M. (2025). Predicting Thyroid Cancer Recurrence Using Machine Learning: An Artificial Intelligence Approach to Clinical Oncology. International Journal of Advances in Artificial Intelligence and Machine Learning, 2(3), 135–148. https://doi.org/10.58723/ijaaiml.v2i3.469