Date of Award
2025
Document Type
Thesis
Degree Name
Master of Science in Artificial Intelligence
Department
Computer Science
Committee Chair and Members
Reda Nacif Elalaoui, Chair
Abla Bedoui
Debarshi Ghosh
Keywords
Evolutionary feature selection, FNIH OA biomarker consortium, Knee osteoarthritis progression, Machine learning, Multimodal regression models, Predictive modeling in KOA
Abstract
Knee Osteoarthritis (KOA) is a progressive musculoskeletal disease involving cartilage matrix degradation, subchondral bone remodeling, and systemic inflammation, and it significantly impairs joint function and mobility. Existing KOA prediction models are not designed to account for nonlinear interactions among multimodal biomarkers or to integrate biochemical and imaging data, which limits their clinical utility. Current methods for early detection and prediction of KOA progression rely primarily on machine learning applied to radiographic imaging data, with static feature selection and deterministic outputs. These approaches often fail to capture the pathophysiology of KOA progression, a complex cascade of processes that runs from biochemical changes to cartilage degradation. A predictive modeling framework for the early detection of KOA therefore requires an understanding of the biochemical markers (biomarkers) that lead to disease progression. Biomarkers such as CTX-I (C-telopeptide of crosslinked type I collagen), NTX-I (N-telopeptide of type I collagen), hsCRP (high-sensitivity C-reactive protein), C1M (type I collagen degradation), and C2M (type II collagen degradation) are being studied extensively as mechanistically relevant markers of KOA initiation or progression. However, these biomarkers have not been integrated into predictive modeling frameworks because patient-specific measurements are of limited availability.
This study presents a multimodal regression-based framework built on the FNIH OA Biomarker Consortium dataset. The framework integrates three domain-specific datasets: Clinical + Bone Biomechanical, Clinical + Cartilage Imaging, and Clinical + Bone Biochemical. Each dataset was modeled with a biologically validated progression target: ΔCTX-II (Urinary C-Terminal Telopeptide of Type II Collagen) for biochemical degradation, ΔWMTMTH (medial tibiofemoral cartilage thickness) for MRI-based tissue loss, and ΔMTCT (medial tibial cartilage thickness) for morphometric changes. Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) were employed to identify predictive biomarkers in high-dimensional, nonlinear feature spaces.
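As an illustration of evolutionary feature selection of the kind described above, the following is a minimal sketch of a genetic algorithm searching over binary feature masks. It is not the thesis implementation: the synthetic data, the ordinary least-squares fitness, the per-feature penalty, and every function name and parameter value (`r_squared`, `ga_select`, `pop_size`, `penalty`, and so on) are assumptions chosen for the example, and PSO is not shown.

```python
import random

def r_squared(X, y, cols):
    """R^2 of an ordinary least-squares fit (with intercept) on the chosen columns."""
    if not cols:
        return 0.0
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    # Normal equations A b = v, where A = R^T R and v = R^T y.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    v = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda q: abs(A[q][i]))
        if abs(A[p][i]) < 1e-12:
            return 0.0  # singular system: treat the subset as uninformative
        A[i], A[p] = A[p], A[i]
        v[i], v[p] = v[p], v[i]
        for q in range(i + 1, k):
            f = A[q][i] / A[i][i]
            for j in range(i, k):
                A[q][j] -= f * A[i][j]
            v[q] -= f * v[i]
    # Back substitution for the coefficients b.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (v[i] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - sum(bi * ri for bi, ri in zip(b, r))) ** 2
                 for r, t in zip(rows, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def ga_select(X, y, n_features, pop_size=20, generations=30, penalty=0.02, seed=0):
    """Genetic algorithm over 0/1 masks; returns the indices of the selected features."""
    rng = random.Random(seed)

    def fitness(mask):
        cols = [i for i, bit in enumerate(mask) if bit]
        # Reward goodness of fit, penalize subset size (assumed trade-off).
        return r_squared(X, y, cols) - penalty * len(cols)

    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [i for i, bit in enumerate(best) if bit]

# Synthetic data (hypothetical): only columns 1 and 4 influence the target.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(80)]
y = [2.0 * row[1] - 3.0 * row[4] + data_rng.gauss(0, 0.1) for row in X]

selected = ga_select(X, y, n_features=6)
print("selected feature indices:", selected)
```

In this toy setup only columns 1 and 4 carry signal by construction, so a working search should retain both. A PSO variant would typically keep the same mask encoding and fitness but replace crossover and mutation with velocity updates.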
Five regression models were evaluated: Random Forest, Bayesian Ridge, Linear Regression, Quantile Regression, and Evolutionary Regression. On the Clinical + Biochemical dataset, the Random Forest model optimized with PSO achieved the best performance (R² = 0.977, RMSE = 0.2709), based on the features N-Telopeptide of Type I Collagen (NTX-I) and X-ray Joint Space Measurement (XRJSM). On the Clinical + Cartilage Imaging dataset, Bayesian Ridge performed best (R² = 0.9441, RMSE = 0.0205) using Whole Medial Tibial Cartilage Volume (WMTVCL) and Estimated Medial Tibial Mean Thickness (EMTMTH). On the Clinical + Bone Biomechanical dataset, Quantile Regression performed best (R² = 0.9705, RMSE = 0.0067) using the Subchondral Bone Area of the Medial Tibia (SUBBAREA_MEDTIB) and the Curvature Standard Deviation of the Subchondral Bone in the Medial Femur (CURSD_SUBB_MEDFEM). The proposed framework combines regression-based models with evolutionary feature selection to improve interpretability and achieve state-of-the-art prediction of KOA progression using biologically meaningful targets.
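For readers less familiar with the two reported metrics, the sketch below computes R² and RMSE from their standard definitions. The function names and the four-point example are hypothetical and are unrelated to the thesis data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up values: three exact predictions and one that is off by 1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

print("RMSE:", rmse(y_true, y_pred))
print("R^2:", r2(y_true, y_pred))
```

Because RMSE is expressed in the units of each target, RMSE values for different targets (here ΔCTX-II, ΔWMTMTH, and ΔMTCT) are not directly comparable with one another, whereas R² is unitless.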
Keywords: Knee Osteoarthritis (KOA), Machine Learning, Biomarkers, Feature Selection, Regression Modeling, Cartilage Morphology, Bone Biomechanics, CTX-II, NTX-I, CTX-I, hsCRP, C1M, C2M, WMTMTH, MTCT, XRJSM, WMTVCL, EMTMTH, SUBBAREA_MEDTIB, CURSD_SUBB_MEDFEM
Recommended Citation
Sri Sai Vemuri, Varun, "Strategic identification of prognostic biomarkers for knee osteoarthritis via optimized regression techniques" (2025). Selected Full-Text Master Theses 2021-. 48.
https://digitalcommons.liu.edu/brooklyn_fulltext_master_theses/48