Date of Award
2025
Document Type
Thesis
Degree Name
Master of Science (MS)
Department
Computer Science
Committee Chair and Members
Reda Nacif Elalaoui, Chair
Abla Bedoui
Keywords
Artificial Intelligence in healthcare, Cardiotoxicity prediction, Deep learning models, Drug safety monitoring, Electrocardiogram (ECG) biomarkers, Machine learning classification
Abstract
Drug-induced cardiotoxicity presents a significant challenge in clinical practice and drug development, particularly for medications that modulate the calcium, potassium, and sodium channels governing cardiac electrophysiology. Clinical practice often relies on QTc prolongation alone as a predictor, which lacks specificity and may lead to the exclusion of otherwise safe therapeutic options. To address this limitation, this study integrates electrocardiogram (ECG) biomarkers with normalized drug dosage data to improve the accuracy of cardiotoxicity risk prediction using machine learning techniques. ECG features, including QT, QRS, RR, and PR intervals, were analyzed alongside normalized dosage data to account for dose-dependent cardiac effects. A physiologically driven risk scoring algorithm was developed to categorize cardiotoxicity severity, and its output was then used to train four machine learning models: a convolutional neural network (CNN), two gradient-boosted tree models (XGBoost and LightGBM), and a multi-task neural network (MTNN).
Among the four models tested, the CNN achieved perfect recall (1.000) and an ROC AUC of 0.995, indicating strong sensitivity to subtle ECG variations. However, it also showed the lowest precision (0.488) and F1-score (0.655), reflecting a high false positive rate. In contrast, XGBoost achieved a strong overall balance, with an accuracy of 0.997, precision of 0.907, F1-score of 0.951, and a perfect ROC AUC (1.000), effectively capturing ECG and dosage interactions. LightGBM outperformed the other models on most metrics, with the highest accuracy (0.998), precision (0.929), and F1-score (0.963) and a perfect ROC AUC (1.000), making it a reliable model for confident classification. The MTNN also performed well, with a precision of 0.927, recall of 0.974, F1-score of 0.950, and a perfect ROC AUC (1.000).
These findings highlight the importance of integrating multiple ECG features with pharmacologic data when assessing drug-induced cardiotoxicity risk. The models’ differing strengths suit different clinical needs, ranging from broad screening (CNN) to high-confidence risk classification (LightGBM). This approach supports earlier and more precise identification of proarrhythmic risk, enhancing patient safety and improving decision-making in clinical practice and in cardiovascular drug development.
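The relationship between the precision, recall, and F1-score figures quoted above can be illustrated with a minimal sketch. It assumes the standard definition of F1 as the harmonic mean of precision and recall; the helper name `f1_score` and the `reported` dictionary are illustrative and are not taken from the thesis. Only the two models for which the abstract gives both precision and recall are included.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)


# Values as quoted in the abstract, rounded to three decimals.
# (Hypothetical container, used here only for illustration.)
reported = {
    "CNN": {"precision": 0.488, "recall": 1.000, "f1": 0.655},
    "MTNN": {"precision": 0.927, "recall": 0.974, "f1": 0.950},
}

for name, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{name}: computed F1 = {computed:.3f}, reported F1 = {m['f1']:.3f}")
```

For both models the recomputed value agrees with the reported F1-score to within about 0.001, which is consistent with the inputs having been rounded to three decimals. The CNN case shows how perfect recall combined with low precision still yields a modest F1-score.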
Recommended Citation
Wong, Jamie, "A machine learning based framework for predicting drug cardiotoxicity using a combination of ECG biomarkers and drug dosage data" (2025). Selected Full-Text Master Theses 2021-. 41.
https://digitalcommons.liu.edu/brooklyn_fulltext_master_theses/41