Selected Full-Text Master Theses 2021-

A machine learning based framework for predicting drug cardiotoxicity using a combination of ECG biomarkers and drug dosage data

Jamie Wong, Long Island University

Date of Award

2025

Document Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Science

Committee Chair and Members

Reda Nacif Elalaoui, Chair

Abla Bedoui

Keywords

Artificial Intelligence in healthcare, Cardiotoxicity prediction, Deep learning models, Drug safety monitoring, Electrocardiogram (ECG) biomarkers, Machine learning classification

Abstract

Drug-induced cardiotoxicity presents a significant challenge in clinical practice and in clinical drug development, particularly for medications that modulate the calcium, potassium, and sodium channels governing cardiac electrophysiology. Clinical practice often relies on QTc prolongation alone as a predictor, which lacks specificity and may lead to the exclusion of otherwise safe therapeutic options. To address this limitation, this study integrates electrocardiogram (ECG) biomarkers with normalized drug dosage data to improve the accuracy of cardiotoxicity risk prediction using machine learning techniques. ECG features, including QT, QRS, RR, and PR intervals, were analyzed alongside normalized dosage data to account for dose-dependent cardiac effects. A physiologically driven risk scoring algorithm was developed to categorize cardiotoxicity severity, and the resulting severity categories were then used to train four machine learning models: a convolutional neural network (CNN), two gradient-boosted tree models (XGBoost and LightGBM), and a multi-task neural network (MTNN). Among the four models tested, the CNN achieved perfect recall (1.000) and an ROC AUC of 0.995, indicating strong sensitivity to subtle ECG variations. However, it also had the lowest precision (0.488) and F1-score (0.655), reflecting a large share of false positives among its positive predictions. In contrast, XGBoost achieved a strong overall balance, with an accuracy of 0.997, precision of 0.907, F1-score of 0.951, and a perfect ROC AUC (1.000), effectively capturing ECG and dosage interactions. LightGBM led on most metrics, with the highest accuracy (0.998), precision (0.929), and F1-score (0.963), along with a perfect ROC AUC, making it a reliable model for confident classification. The MTNN also performed strongly, with a precision of 0.927, recall of 0.974, F1-score of 0.950, and a perfect ROC AUC.
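The reported metrics can be cross-checked with the standard definition of the F1-score as the harmonic mean of precision and recall. The sketch below is not taken from the thesis; the function names `f1_score` and `implied_recall` are illustrative, and the inputs are only the rounded values quoted in the abstract, so small differences in the third decimal place are expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the F1-score, the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given precision P and F1."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall exists for this precision/F1 pair")
    return f1 * precision / denominator


# (precision, recall, reported F1) as quoted in the abstract.
reported = {
    "CNN": (0.488, 1.000, 0.655),
    "MTNN": (0.927, 0.974, 0.950),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.3f}, reported F1 = {f1:.3f}")

# The abstract gives no recall for the tree models; this is only an estimate
# derived from rounded inputs, not a value reported by the author.
for name, (p, f1) in {"XGBoost": (0.907, 0.951), "LightGBM": (0.929, 0.963)}.items():
    print(f"{name}: implied recall is roughly {implied_recall(p, f1):.3f}")
```

The recomputed F1-scores for the CNN and the MTNN agree with the reported values to within 0.001, which is consistent with rounding. The implied recall for XGBoost and LightGBM should be read as a rough estimate only, since it is sensitive to rounding in the inputs.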
These findings highlight the importance of integrating multiple ECG features with pharmacologic data when assessing drug-induced cardiotoxicity risk. The models' differing strengths suit different clinical needs, ranging from broad screening (CNN) to high-confidence risk classification (LightGBM). This approach supports earlier and more precise identification of proarrhythmic risk, enhancing patient safety and improving decision-making in clinical practice as well as in cardiovascular drug development.
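For readers unfamiliar with QTc, the measure the abstract says is often used on its own, the sketch below shows how a heart-rate-corrected QT interval is derived from the QT and RR intervals. The abstract does not state which correction formula the thesis uses; Bazett's formula is assumed here only because it is a widely used one. The function name `qtc_bazett` and the sample interval values are illustrative and are not data from the thesis.

```python
import math


def qtc_bazett(qt_ms: float, rr_ms: float) -> float:
    """Heart-rate-corrected QT interval in ms using Bazett's formula.

    QTc = QT / sqrt(RR), with RR expressed in seconds.
    Assumption: Bazett's correction; the thesis may use a different one.
    """
    if qt_ms <= 0 or rr_ms <= 0:
        raise ValueError("QT and RR intervals must be positive")
    rr_seconds = rr_ms / 1000.0
    return qt_ms / math.sqrt(rr_seconds)


# Illustrative values only: the same 400 ms QT interval at two heart rates.
print(round(qtc_bazett(400, 1000), 1))  # RR of 1000 ms (60 beats per minute)
print(round(qtc_bazett(400, 640), 1))   # RR of 640 ms (faster heart rate)
```

At an RR interval of 1000 ms the correction leaves QT unchanged, while a shorter RR interval yields a larger QTc for the same measured QT. This shows why a single corrected interval summarizes only part of the ECG, and the study adds the QRS and PR intervals and dosage data alongside it.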

Recommended Citation

Wong, Jamie, "A machine learning based framework for predicting drug cardiotoxicity using a combination of ECG biomarkers and drug dosage data" (2025). Selected Full-Text Master Theses 2021-. 41.
https://digitalcommons.liu.edu/brooklyn_fulltext_master_theses/41

Included in

Computer Sciences Commons

Digital Commons @ LIU
