A SYSTEMS-LEVEL APPROACH TO EARLY DETECTION OF METABOLIC SYNDROME IN ADOLESCENTS USING MACHINE LEARNING AND BIOLOGICAL MARKERS

Rabia Zulfiqar
Nargis Khan
Gull Hassan Shethar
Tasneem Munir

Abstract

Background: Metabolic Syndrome (MetS) during adolescence is increasingly prevalent and closely linked to the future onset of cardiovascular disease, type 2 diabetes, and other chronic metabolic conditions. Traditional diagnostic approaches often fail to reflect the complex, multifactorial nature of the syndrome, particularly in younger populations. An integrative, data-driven strategy is essential to improve early identification and enable targeted prevention in at-risk adolescents.


Objective: To develop and evaluate machine learning (ML)-based predictive models that integrate biological and clinical markers for early detection of MetS in adolescents.


Methods: This cross-sectional study analyzed data from 110 adolescents aged 10–18 years, including anthropometric measures (BMI, waist circumference), biochemical markers (fasting glucose, triglycerides, HDL cholesterol, fasting insulin, HOMA-IR), blood pressure, and lifestyle indicators. After data preprocessing and normalization, key features were selected using recursive feature elimination and mutual information techniques. Four supervised ML models (Gradient Boosting, Random Forest, Support Vector Machines (SVM), and Neural Networks) were trained and evaluated using 10-fold cross-validation. Model performance was assessed using accuracy, precision, recall, F1 score, and the area under the receiver operating characteristic curve (AUC-ROC). SHAP (SHapley Additive exPlanations) analysis was used to interpret feature contributions.
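The evaluation steps named in the Methods can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' pipeline. The HOMA-IR helper uses the conventional formula with glucose in mg/dL and insulin in µU/mL (the abstract does not state units). The function names, the contiguous fold layout, and the toy labels are all hypothetical. A real analysis would normally shuffle and stratify the folds.

```python
def homa_ir(glucose_mg_dl, insulin_uU_ml):
    """HOMA-IR from fasting glucose (mg/dL) and fasting insulin (µU/mL).

    Assumes conventional units; the divisor 405 applies to mg/dL only.
    """
    return glucose_mg_dl * insulin_uU_ml / 405.0


def k_fold_indices(n_samples, k=10):
    """Partition range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = MetS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 110 participants split into 10 folds gives 11 held-out participants per fold.
folds = k_fold_indices(110, k=10)

# Hypothetical labels, used only to exercise the metric definitions.
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_metrics = classification_metrics(toy_true, toy_pred)
```

With 110 participants, each of the 10 folds holds out exactly 11 adolescents, so each fold-level metric rests on a small test set.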


Results: Gradient Boosting outperformed all other models with an accuracy of 90.0%, precision of 0.86, recall of 0.84, and AUC-ROC of 0.92. Random Forest followed with 89.1% accuracy and 0.91 AUC-ROC. SVM and Neural Networks achieved 85.5% and 88.2% accuracy, respectively. SHAP analysis revealed waist circumference (r = 0.68), triglycerides (r = 0.63), HOMA-IR (r = 0.59), fasting insulin (r = 0.50), and HDL cholesterol (r = –0.56) as the top contributors to MetS prediction.
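As a quick arithmetic check, the reported accuracies can be converted back to counts of correctly classified participants. The sketch below assumes each accuracy is pooled over all 110 adolescents, which the abstract does not state explicitly. The F1 value it computes follows from the reported precision and recall by the standard harmonic-mean formula; it is a derived figure, not one reported in the source.

```python
N_PARTICIPANTS = 110  # sample size stated in the Methods

# Accuracies (%) as reported in the Results.
reported_accuracy = {
    "Gradient Boosting": 90.0,
    "Random Forest": 89.1,
    "SVM": 85.5,
    "Neural Network": 88.2,
}

# Nearest whole number of correct classifications implied by each accuracy,
# assuming accuracy was computed over all participants.
implied_correct = {
    name: round(acc / 100 * N_PARTICIPANTS)
    for name, acc in reported_accuracy.items()
}

# Accuracy recomputed from the implied counts, rounded to one decimal place.
recomputed = {
    name: round(100 * correct / N_PARTICIPANTS, 1)
    for name, correct in implied_correct.items()
}

# F1 implied by the reported Gradient Boosting precision and recall.
precision, recall = 0.86, 0.84
implied_f1 = 2 * precision * recall / (precision + recall)

for name in reported_accuracy:
    print(f"{name}: {implied_correct[name]}/{N_PARTICIPANTS} "
          f"-> {recomputed[name]}%")
print(f"Implied Gradient Boosting F1: {implied_f1:.2f}")
```

Under this assumption the four accuracies correspond to 99, 98, 94, and 97 correct classifications out of 110, so the models are separated by only a handful of participants.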


Conclusion: Ensemble ML methods, especially Gradient Boosting, demonstrated high predictive accuracy in identifying adolescents at risk for MetS using integrated clinical and biological data. These models offer promise for early, personalized interventions and warrant validation in larger and longitudinal cohorts.

Author Biographies

Rabia Zulfiqar, King Edward Medical University, Lahore, Pakistan.

Student, Certificate of Medical Teaching, Department of Community Medicine, King Edward Medical University, Lahore, Pakistan.

Nargis Khan, Dow University of Health Sciences (DUHS), Karachi, Pakistan.

Associate Professor, Department of Medicine, Dow University of Health Sciences (DUHS), Karachi, Pakistan.

Gull Hassan Shethar, Al-Amiri Hospital, Kuwait.

Consultant, Department of Medicine, Al-Amiri Hospital, Kuwait.

Tasneem Munir, Lahore General Hospital, Lahore, Pakistan.

Surgical Technologist, Lahore General Hospital, Lahore, Pakistan.