Machine Learning-Based Discrimination of Cardiovascular Outcomes in Patients With Hypertrophic Cardiomyopathy

Open Access

Original Research

JACC: Asia, 4 (5) 375–386

Abstract

Background

Current risk stratification strategies for patients with hypertrophic cardiomyopathy (HCM) are limited to traditional methodologies.

Objectives

The authors aimed to establish machine learning (ML)-based models to discriminate major cardiovascular events in patients with HCM.

Methods

We enrolled consecutive HCM patients from 2 tertiary referral centers and used 25 clinical and echocardiographic features to discriminate major adverse cardiovascular events (MACE), including all-cause death, admission for heart failure (HF-adm), and stroke. The best model was selected for each outcome using the area under the receiver operating characteristic curve (AUROC) with 20-fold cross-validation. After testing in the external validation cohort, the relative importance of features in discriminating each outcome was determined using the SHapley Additive exPlanations (SHAP) method.

Results

In total, 2,111 patients with HCM (age 61.4 ± 13.6 years; 67.6% men) were analyzed. During the median 4.0 years of follow-up, MACE occurred in 341 patients (16.2%). Among the 4 ML models, the logistic regression model achieved the best AUROC of 0.800 (95% CI: 0.760-0.841) for MACE, 0.789 (95% CI: 0.736-0.841) for all-cause death, 0.798 (95% CI: 0.736-0.860) for HF-adm, and 0.807 (95% CI: 0.754-0.859) for stroke. The discriminant ability of the logistic regression model remained excellent when applied to the external validation cohort for MACE (AUROC = 0.768), all-cause death (AUROC = 0.750), and HF-adm (AUROC = 0.806). The SHAP analysis identified left atrial diameter and hypertension as important variables for all outcomes of interest.

Conclusions

The proposed ML models incorporating various phenotypes from patients with HCM accurately discriminated adverse cardiovascular events and provided variables with high importance for each outcome.

Introduction

Hypertrophic cardiomyopathy (HCM) is an inheritable myocardial disease with a prevalence of 1:500 in the general population.1 HCM is a well-known leading cause of sudden cardiac death (SCD), especially in young individuals.2,3 Thus, the current risk stratification approach for patients with HCM has primarily focused on SCD.4,5 Given the advancing age of newly diagnosed HCM patients,6 however, it is not surprising that their life expectancy and quality of life are largely determined by cardiovascular complications, such as heart failure (HF),7 stroke,8 or atrial fibrillation (AF).9 Nevertheless, well-validated prediction models for adverse cardiovascular events remain rare in HCM.

Machine learning (ML)-based models containing multidimensional variables predict adverse events with more precision and generalizability than conventional risk predictors in various cardiovascular diseases.10 By incorporating high-order and nonlinear interactions among variables, ML methods provide improved predictive ability compared with the standard regression techniques.11 Therefore, several ML models for predicting unfavorable cardiovascular outcomes in patients with HCM have been suggested.12-14 However, previous models have not been validated in external independent cohorts, substantially limiting the generalizability of the established models. Thus, we aimed to establish and validate data-driven ML-based models to discriminate major cardiovascular events using 2 independent large-scale HCM cohorts. Further, we utilized an explainable ML method to provide novel insights from important features affecting the decision-making process of ML prediction models.

Methods

Study design and population

The study population consisted of patients with HCM who received their first echocardiography examination between 2003 and 2020 at 2 tertiary referral hospitals in Korea (Seoul National University Hospital and Seoul National University Bundang Hospital) (Figure 1). HCM was defined as an increased left ventricular (LV) wall thickness (end-diastolic LV wall thickness ≥15 mm or ≥13 mm in individuals with a familial history of HCM), with LV hypertrophy unattributable to secondary causes such as hypertension or aortic stenosis.5 In order to establish the ML models, the cohort from Seoul National University Hospital (n = 1,006) was assigned as the derivation cohort. The Seoul National University Bundang Hospital cohort (n = 1,105) was the external validation cohort used to evaluate the predictive ability of the best prediction model. The study protocol was approved by the institutional review board of each hospital and conducted according to the principles of the Declaration of Helsinki. Written informed consent was waived due to the retrospective nature of the study.

Figure 1

Study Flow

Patients who were diagnosed with hypertrophic cardiomyopathy (HCM) between 2003 and 2020 were consecutively enrolled. Two independent cohorts were used as the derivation and validation cohorts, respectively. Using 25 clinical and echocardiographic features, major adverse cardiovascular events (MACE) and their individual components during a median 4.0 years of follow-up were analyzed by 4 different machine learning-based discriminative models. AI = artificial intelligence; HF = heart failure; LDA = linear discriminant analysis; LR = logistic regression; RF = random forest; SHAP = Shapley additive explanation; SNUBH = Seoul National University Bundang Hospital; SNUH = Seoul National University Hospital; SVM = support vector machine.

Echocardiographic examination

Three distinct vendors (GE Medical Systems, Philips Healthcare, and Siemens Medical Solutions) were utilized for the echocardiographic examination. Dimensions of the LV and left atrium (LA) and LV end-diastolic wall thickness were measured on parasternal long-axis views. In addition, the LV ejection fraction was evaluated on the apical 4- and 2-chamber views using the biplane Simpson's method following the guidelines.15 The LV outflow tract pressure gradient was measured at rest and with the Valsalva maneuver, with the maximum value acquired. Apical HCM was defined as pathological hypertrophy of the LV limited to the apical segments of the ventricle.

Feature selection and data preprocessing

We first excluded 5 echocardiographic parameters (LV internal diameter at end-systole, LV end-diastolic volume index, stroke volume index, medial e’, and E velocity) from the 30 candidate features because of multicollinearity. Because the proportion of missing values for each feature was <5%, no feature was removed for excessive missing values. As a result, 15 clinical and 10 echocardiographic features were included in the analysis. The variables used in the final analysis were as follows: age, sex, body mass index, hypertension, diabetes mellitus, dyslipidemia, history of HF, AF, history of stroke, coronary artery disease, valvular heart disease, history of cancer, family history of SCD, history of syncope, apical type HCM, use of beta-blocker, maximum LV wall thickness, maximum LV outflow tract pressure gradient, LV internal diameter at end-diastole, LV end-systolic volume index (LVESVi), LV ejection fraction, LA diameter, deceleration time, E/e’, and estimated pulmonary artery systolic pressure. The K nearest neighbor imputation method was used to fill in missing values to prevent the loss of important information and unstable model implementation. All results were analyzed based on the imputed dataset.
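The K nearest neighbor imputation step can be illustrated with a short, dependency-free sketch. The number of neighbors, the distance metric, and the implementation used in the study are not reported (scikit-learn offers a `KNNImputer` class for this purpose), so the function name `knn_impute`, the choice of `k = 2`, and the toy rows below are assumptions made purely for illustration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance is Euclidean over the features observed in both rows, rescaled
    by the fraction of features compared so sparse rows are not favoured.
    Assumes every feature is observed in at least one other row.
    """
    n_features = len(rows[0])

    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum(shared) * n_features / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which feature j was observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for _, v in donors[:k]]
            imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

# Hypothetical rows: [age (y), LA diameter (mm), E/e']
toy = [
    [60.0, 45.0, 14.0],
    [62.0, 46.0, None],   # E/e' missing
    [61.0, 44.0, 15.0],
    [30.0, 35.0, 8.0],
]
filled = knn_impute(toy, k=2)
```

In practice the features would normally be standardized before distances are computed, so that variables measured on large scales (such as deceleration time in msec) do not dominate the neighbor search.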

Study outcomes and longitudinal follow-up

The outcomes of interest were MACE (defined as a composite of all-cause mortality, HF admission [HF-adm], and stroke) and the individual components. Using the National Death Registration Records of Korea (independently managed by the Korean government), the vital status of all study participants was verified. HF-adm was defined as at least 1 episode of hospitalization owing to HF, which was clinically diagnosed based on aggravating symptoms and signs of congestion with volume overload, including dyspnea and peripheral edema, or on the use of diuretic agents for volume overload.16 Stroke was defined as a sudden neurologic impairment lasting over 24 hours due to vascular pathologies of the brain, such as thromboembolism, hemorrhage, or ruptured aneurysm. Dedicated research personnel gathered clinical outcomes by reviewing electronic health records and performing telephone interviews. From the initial echocardiographic examination date, participants were followed up until the event, the end of follow-up, or death, whichever occurred first. The median follow-up was 4.0 years (Q1-Q3: 1.6-7.6 years).

Development of ML models and external validation of the best prediction model

In contrast to the traditional methodologies, ML-based methods utilize all provided variables through multiple iteration processes to select the optimal model.17 Consequently, ML-based methods can manage nonlinearity, high dimensionality, and variable interactions, delivering more accurate variable importance and superior predictive performance.18 In this study, 4 ML classifiers (logistic regression [LR], linear discriminant analysis, random forest, and support vector machine) were used to construct the discriminative models for each target outcome. The 20-fold cross-validation method was used to fine-tune the model parameters in the derivation cohort. The performance of the established model in the derivation cohort was assessed using the area under the receiver operating characteristic curve (AUROC) and the calibration plot by deciles of predicted risk of event. Based on the AUROC values, the best performing model was chosen for each outcome. We then evaluated the discriminant ability of ML models in the external validation cohort. As a sensitivity analysis, we performed a cross-over analysis with the roles of the derivation and validation cohorts switched. In addition, we performed a hold-out cross-validation analysis by merging the cohorts into a single pooled dataset of 2,111 patients, which was then randomly split into training and test cohorts in a 7:3 ratio.
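The model-selection criterion described above, a k-fold cross-validated AUROC, can be sketched without any third-party dependency. This is an illustration only: the study used scikit-learn, and it is not stated whether fold-level AUROCs were averaged or whether out-of-fold predictions were pooled, so the pooling used here is an assumption. The `fit` callable, the toy classifier `fit_mean_difference`, the seed, and the data are all hypothetical.

```python
import random

def auroc(labels, scores):
    """AUROC as the probability that a random event case outscores a random
    non-event case (ties count one half); this equals the Mann-Whitney U
    statistic divided by n_pos * n_neg."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_auroc(X, y, fit, k=20, seed=0):
    """Pool out-of-fold scores from k-fold cross-validation, then score them.

    `fit(X_train, y_train)` is a hypothetical callable that returns a
    function mapping one feature row to a risk score.
    """
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]   # k near-equal folds
    out_of_fold = [None] * len(y)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            out_of_fold[i] = model(X[i])
    return auroc(y, out_of_fold)

def fit_mean_difference(X_train, y_train):
    """Toy stand-in for a classifier: score feature 0 by its distance from
    the midpoint between the two class means (assumes events run higher)."""
    events = [row[0] for row, label in zip(X_train, y_train) if label == 1]
    others = [row[0] for row, label in zip(X_train, y_train) if label == 0]
    midpoint = (sum(events) / len(events) + sum(others) / len(others)) / 2
    return lambda row: row[0] - midpoint

# Hypothetical, well-separated data: 20 non-events followed by 20 events.
X = [[float(i)] for i in range(40)]
y = [0] * 20 + [1] * 20
cv_auc = cross_validated_auroc(X, y, fit_mean_difference, k=20)
```

Repeating the call with a different `fit` callable for each candidate classifier, and keeping the one with the highest value, mirrors the selection rule stated in the text.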

Feature importance analysis using the Shapley additive explanations method

We utilized the Shapley additive explanations (SHAP) method to rank the relative importance of each feature incorporated in the ML models with the best discriminant ability established from the derivation cohort. Using the SHAP method, the Shapley values, indicating the contribution of each feature to the predictive probability of target outcomes, were estimated.19 Considering the possible interactions among all features, the Shapley value of a feature represents the effect of deleting the feature from the prediction model.20 Model development was done using Python version 3.7.6 (Python Software Foundation) and scikit-learn version 1.0.2 (scikit-learn Project). The SHAP analysis results were reported as radar plots and SHAP summary plots, depicting the scaled importance of each feature (the relative importance of a feature scaled with respect to the feature with the greatest relative importance value).
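For a linear model such as logistic regression, Shapley values have a closed form when features are treated as independent: on the log-odds scale, the contribution of feature j for one patient is the coefficient times the deviation of that patient's value from the feature mean. The sketch below uses this identity to compute a scaled importance of the kind described above. Which SHAP explainer the study used is not stated, and the coefficients and patient rows here are hypothetical.

```python
def linear_shap_values(coefs, X):
    """Shapley values on the log-odds scale for a linear model.

    Assuming independent features, the contribution of feature j for one
    patient is coef_j * (x_j - mean of x_j over the background data).
    """
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(coefs))]
    return [[w * (x - m) for w, x, m in zip(coefs, row, means)] for row in X]

def scaled_importance(shap_values):
    """Mean |SHAP| per feature, divided by the largest such mean."""
    n = len(shap_values)
    mean_abs = [sum(abs(row[j]) for row in shap_values) / n
                for j in range(len(shap_values[0]))]
    top = max(mean_abs)
    return [m / top for m in mean_abs]

# Hypothetical coefficients and patients:
# [age (y), LA diameter (mm), hypertension (0/1)]
coefs = [0.04, 0.06, 0.50]
X = [[50, 40, 0], [60, 44, 1], [70, 48, 1], [80, 52, 0]]
phi = linear_shap_values(coefs, X)
importance = scaled_importance(phi)
```

By construction, the values for one patient sum to the difference between that patient's log-odds and the average log-odds, which is the additivity property that gives the method its name.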

Statistical analysis

The chi-square test was utilized to compare categorical variables expressed as numbers and relative frequencies (percentages). Continuous variables were reported as mean ± SD. We utilized the Kolmogorov-Smirnov test to determine whether the continuous variables had a normal distribution. Student's t-test was used to analyze differences between continuous characteristics with a normal distribution; otherwise, the Mann-Whitney U test was employed. All probability values were 2-sided, and P values <0.05 were considered significant. Analyses were conducted using Stata software version 17.0 (StataCorp).
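As a worked example of the categorical comparison, the sketch below applies a Pearson chi-square test to the hypertension counts reported in Table 1. Whether a continuity correction was applied in the study is not stated; this version uses the uncorrected statistic, and the function name `chi_square_2x2` is an illustrative choice.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided P value) with df = 1."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the chi-square tail equals erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Hypertension counts from Table 1: 455 of 1,006 vs 694 of 1,105.
stat, p = chi_square_2x2(455, 1006 - 455, 694, 1105 - 694)
```

The resulting P value is far below 0.001, in line with the "<0.001" entry reported for hypertension.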

Results

Characteristics and clinical outcomes of the study population

The baseline characteristics of the derivation and validation cohorts are presented in Table 1. In total, 2,111 patients (derivation cohort n = 1,006; validation cohort n = 1,105) with HCM (mean age 61.4 ± 13.6 years; 67.6% men) were analyzed. The distributions of age, sex, and body mass index were comparable between the 2 cohorts. Among HCM-related factors, the derivation cohort had a higher proportion of patients with a history of syncope and apical type HCM. The proportion of patients on beta-blockers was higher in the validation cohort. The incidence rate of each outcome is presented in Table 2. During the median follow-up of 4.0 years, MACE occurred at a rate of 3.72 per 100 person-years in the derivation cohort and 3.35 per 100 person-years in the validation cohort.

Table 1 Baseline Characteristics of Study Population

Characteristic | Derivation Cohort (n = 1,006) | Validation Cohort (n = 1,105) | P Value
Demographic features
 Age, y | 61.6 ± 13.3 | 61.3 ± 13.8 | 0.630
 Male | 67.2 (676) | 67.9 (750) | 0.740
 Body mass index, kg/m² | 24.8 ± 3.3 | 24.9 ± 3.5 | 0.247
Comorbidities
 Hypertension | 45.2 (455) | 62.8 (694) | <0.001
 Diabetes mellitus | 19.1 (192) | 22.7 (251) | 0.041
 Dyslipidemia | 23.6 (237) | 44.9 (496) | <0.001
 History of heart failure | 4.1 (41) | 4.8 (53) | 0.423
 Atrial fibrillation | 21.5 (216) | 16.6 (183) | 0.004
 History of stroke | 4.2 (42) | 13.2 (146) | <0.001
 Coronary artery disease | 10.9 (110) | 8.0 (88) | 0.019
 Valvular heart disease | 3.9 (39) | 4.9 (54) | 0.259
 History of cancer | 11.0 (111) | 13.0 (144) | 0.159
HCM-related risk factors
 Familial history of SCD | 7.6 (76) | 7.1 (79) | 0.721
 History of syncope | 13.3 (134) | 7.6 (84) | <0.001
 Apical type | 43.2 (435) | 38.6 (426) | 0.029
 Beta-blocker use | 35.6 (358) | 69.4 (767) | <0.001
Echocardiographic parameters
 Maximal LV wall thickness, mm | 18.3 ± 4.0 | 18.0 ± 3.8 | 0.039
 Maximal LVOT pressure gradient, mm Hg | 13.9 ± 27.2 | 16.3 ± 31.3 | 0.068
 LVEDVi, mL/m² | 86.3 ± 19.1 | 81.2 ± 19.1 | <0.001
 LVESVi, mL/m² | 30.7 ± 10.4 | 30.1 ± 11.9 | 0.230
 LVEF, % | 63.6 ± 6.9 | 63.8 ± 7.2 | 0.471
 LA diameter, mm | 45.8 ± 7.5 | 42.3 ± 7.6 | <0.001
 E velocity, m/s | 0.6 ± 0.2 | 0.7 ± 0.2 | <0.001
 Deceleration time, msec | 211.6 ± 90.5 | 205.3 ± 68.6 | 0.073
 Medial e’, cm/s | 4.7 ± 1.6 | 5.5 ± 2.0 | <0.001
 E/e' | 14.4 ± 6.9 | 13.5 ± 6.6 | 0.005
 Estimated PASP, mm Hg | 17.6 ± 17.7 | 29.5 ± 9.9 | <0.001

Values are mean ± SD or % (n).

HCM = hypertrophic cardiomyopathy; LA = left atrium; LV = left ventricle; LVEDVi = left ventricular end-diastolic volume index; LVEF = left ventricular ejection fraction; LVESVi = left ventricular end-systolic volume index; LVIDd = left ventricular internal dimension at end-diastole; LVIDs = left ventricular internal dimension at end-systole; LVOT = left ventricular outflow tract; PASP = pulmonary arterial systolic pressure; SCD = sudden cardiac death.

Table 2 Incidence Rate of Study Outcomes

Outcome | Derivation Cohort (n = 1,006) | Validation Cohort (n = 1,105)
MACE
 Incidence rate, per 100 person-years | 3.72 | 3.35
All-cause death
 Incidence rate, per 100 person-years | 2.46 | 1.11
HF admission
 Incidence rate, per 100 person-years | 1.51 | 1.24
Stroke
 Incidence rate, per 100 person-years | 1.43 | 1.42

HF = heart failure; MACE = major adverse cardiovascular event(s).
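An incidence rate per 100 person-years divides the number of events by the total follow-up time contributed by all patients. The sketch below shows the arithmetic on hypothetical patients; the per-cohort event counts and person-years behind Table 2 are not reported, so none of these numbers come from the study.

```python
def incidence_rate_per_100py(follow_up_years, had_event):
    """Events per 100 person-years; each patient contributes time until the
    event or censoring, whichever comes first."""
    person_years = sum(follow_up_years)
    events = sum(had_event)
    return 100 * events / person_years

# Hypothetical patients: follow-up time in years and event indicator.
years = [4.0, 1.5, 7.5, 2.0, 5.0]
event = [0, 1, 0, 1, 0]
rate = incidence_rate_per_100py(years, event)   # 2 events over 20 person-years
```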

Development and external validation of the ML prediction model for clinical outcomes

The ability of the 4 ML models to discriminate clinical outcomes is depicted in Figure 2. The 20-fold cross-validated AUROC ranged from 0.728 to 0.807 across the ML models, indicating excellent discriminant ability for all target outcomes (Table 3). The LR model demonstrated the best performance for all outcomes of interest among the 4 ML models. The 20-fold cross-validated AUROC was 0.800 (95% CI: 0.760-0.841) for MACE, 0.789 (95% CI: 0.736-0.841) for all-cause death, 0.798 (95% CI: 0.736-0.860) for HF-adm, and 0.807 (95% CI: 0.754-0.859) for stroke. Calibration bar plots by deciles of the observed vs predicted risk estimated using the LR model are presented for each outcome in Figure 3. Overall, the LR model–predicted probability corresponded with the observed probability of each event in the derivation cohort. Figure 4 and Table 4 present the AUROC curves and detailed metrics of the LR model performances in the external validation cohort. When applied to the validation cohort, the performance of the LR model remained excellent for MACE (AUROC = 0.771), all-cause death (AUROC = 0.759), and HF-adm (AUROC = 0.807), but not for stroke (AUROC = 0.694).

Figure 2

Performance of 4 ML Models for Each Clinical Outcome

Predictive ability of 4 machine learning (ML) discriminative models for (A) MACE, (B) death, (C) HF admission, and (D) stroke are presented as receiver-operating characteristic (ROC) curves and compared by the area under the receiver-operating curve (AUROC) with 95% CI. Abbreviations as in Figure 1.

Table 3 Predictive Performance of Machine Learning-Based Models by 20-Fold Cross-Validation for Each Clinical Outcome in Derivation Cohort

Model | AUC | Sensitivity | Specificity | PPV | NPV
MACE
 Logistic regression | 0.800 (0.760-0.841) | 0.655 (0.592-0.718) | 0.795 (0.740-0.851) | 0.460 (0.393-0.527) | 0.912 (0.899-0.926)
 Linear discriminant analysis | 0.800 (0.760-0.840) | 0.659 (0.595-0.722) | 0.794 (0.738-0.850) | 0.456 (0.398-0.515) | 0.914 (0.901-0.926)
 Random forest | 0.793 (0.752-0.833) | 0.771 (0.705-0.837) | 0.688 (0.608-0.768) | 0.394 (0.334-0.453) | 0.934 (0.921-0.946)
 Support vector machine | 0.787 (0.746-0.829) | 0.666 (0.597-0.735) | 0.788 (0.718-0.858) | 0.488 (0.396-0.580) | 0.914 (0.900-0.927)
All-cause death
 Logistic regression | 0.789 (0.736-0.841) | 0.698 (0.599-0.796) | 0.720 (0.645-0.795) | 0.344 (0.249-0.438) | 0.943 (0.929-0.958)
 Linear discriminant analysis | 0.786 (0.734-0.837) | 0.667 (0.561-0.772) | 0.747 (0.668-0.827) | 0.365 (0.270-0.459) | 0.941 (0.927-0.955)
 Random forest | 0.782 (0.731-0.834) | 0.707 (0.627-0.787) | 0.704 (0.635-0.772) | 0.288 (0.236-0.340) | 0.943 (0.931-0.956)
 Support vector machine | 0.769 (0.707-0.831) | 0.608 (0.547-0.734) | 0.785 (0.730-0.818) | 0.308 (0.272-0.345) | 0.934 (0.923-0.951)
HF admission
 Logistic regression | 0.798 (0.736-0.860) | 0.671 (0.621-0.721) | 0.675 (0.575-0.775) | 0.212 (0.118-0.305) | 0.954 (0.936-0.972)
 Linear discriminant analysis | 0.792 (0.734-0.850) | 0.633 (0.557-0.710) | 0.739 (0.652-0.827) | 0.257 (0.157-0.357) | 0.960 (0.951-0.969)
 Random forest | 0.776 (0.713-0.839) | 0.550 (0.448-0.652) | 0.789 (0.716-0.862) | 0.281 (0.167-0.395) | 0.956 (0.948-0.964)
 Support vector machine | 0.728 (0.666-0.790) | 0.596 (0.491-0.701) | 0.676 (0.584-0.768) | 0.171 (0.116-0.226) | 0.954 (0.946-0.962)
Stroke
 Logistic regression | 0.807 (0.754-0.859) | 0.600 (0.509-0.691) | 0.758 (0.676-0.840) | 0.287 (0.156-0.418) | 0.960 (0.952-0.967)
 Linear discriminant analysis | 0.804 (0.752-0.856) | 0.596 (0.508-0.684) | 0.769 (0.697-0.841) | 0.274 (0.160-0.389) | 0.960 (0.952-0.968)
 Random forest | 0.773 (0.710-0.835) | 0.671 (0.615-0.727) | 0.693 (0.617-0.768) | 0.191 (0.121-0.262) | 0.962 (0.955-0.969)
 Support vector machine | 0.762 (0.690-0.834) | 0.588 (0.498-0.677) | 0.791 (0.737-0.845) | 0.243 (0.149-0.337) | 0.961 (0.953-0.968)

Values in ( ) are 95% CI.

AUC = area under the receiver operating characteristic curve; NPV = negative predictive value; PPV = positive predictive value; other abbreviations as in Table 2.

Figure 3

Observed Risk of Outcomes According to Deciles of Predicted Probability

The predicted and observed risk probability of (A) MACE, (B) death, (C) HF admission, and (D) stroke calculated by the logistic regression machine learning model are presented. Abbreviations as in Figures 1 and 2.
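A decile-based calibration comparison of this kind can be sketched as follows: patients are ranked by predicted probability, split into 10 equal-sized groups, and the mean predicted risk in each group is set against the observed event rate. The function name `calibration_by_quantile` and the synthetic data are assumptions for illustration; the study's exact binning procedure is not described.

```python
def calibration_by_quantile(probs, outcomes, bins=10):
    """Return (mean predicted risk, observed event rate) per risk group,
    ordered from the lowest-risk to the highest-risk group."""
    ranked = sorted(zip(probs, outcomes))
    n = len(ranked)
    table = []
    for b in range(bins):
        group = ranked[b * n // bins:(b + 1) * n // bins]
        if not group:
            continue   # fewer patients than bins
        predicted = sum(p for p, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        table.append((predicted, observed))
    return table

# Synthetic example: 100 patients whose event rate rises with predicted risk.
probs = [i / 100 for i in range(100)]
outcomes = [1 if i % 10 < i // 10 else 0 for i in range(100)]
deciles = calibration_by_quantile(probs, outcomes)
```

A well-calibrated model yields pairs in which the two numbers are close in every group, which is what a bar plot of observed vs predicted risk conveys visually.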

Figure 4

External Validation of LR ML Model

Ability of the best discriminative model (logistic regression machine learning model) for (A) MACE, (B) death, (C) HF admission, and (D) stroke was tested in the external validation cohort and presented with AUROC. Abbreviations as in Figures 1 and 2.

Table 4 Predictive Performance of Machine Learning-Based Logistic Regression Model for Each Clinical Outcome in Validation Cohort

Outcome | AUC | Sensitivity | Specificity | PPV | NPV
MACE | 0.771 | 0.705 | 0.737 | 0.306 | 0.938
All-cause death | 0.759 | 0.746 | 0.657 | 0.102 | 0.980
HF admission | 0.807 | 0.700 | 0.818 | 0.181 | 0.979
Stroke | 0.694 | 0.912 | 0.380 | 0.088 | 0.985

Abbreviations as in Tables 2 and 3.
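The combination of low PPV and high NPV in Table 4 is what Bayes' rule predicts when events are uncommon. The sketch below back-calculates the event prevalence implied by the reported sensitivity, specificity, and PPV for stroke, then checks that the same prevalence reproduces the reported NPV. The prevalence itself is not reported in the table; it is derived here from rounded values and should be read as approximate.

```python
def implied_prevalence(sensitivity, specificity, ppv):
    """Event prevalence implied by sensitivity, specificity, and PPV.

    From Bayes' rule, PPV / (1 - PPV) equals
    sensitivity * prev / ((1 - specificity) * (1 - prev)).
    """
    odds = ppv / (1 - ppv) * (1 - specificity) / sensitivity
    return odds / (1 + odds)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value at a given prevalence."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Stroke row of Table 4: sensitivity 0.912, specificity 0.380, PPV 0.088.
prevalence = implied_prevalence(0.912, 0.380, 0.088)
stroke_npv = npv(0.912, 0.380, prevalence)
```

The recomputed NPV agrees with the tabulated 0.985 to within rounding, so the four stroke metrics are mutually consistent with an event prevalence of roughly 6% in the validation cohort.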

Sensitivity analysis

Sensitivity analyses were performed to replicate the results. As a cross-over analysis, LR-based ML models were developed from the Seoul National University Bundang Hospital cohort (validation cohort in the main analysis) and subsequently tested in the Seoul National University Hospital cohort (derivation cohort in the main analysis) (Supplemental Table 1). The discriminant ability for MACE, all-cause death, and HF-adm was 0.759 to 0.790 when applied to the cross-over validation cohort. Similar to the main result, the LR model performance decreased to AUROC of 0.655 for stroke prediction when externally validated.

A hold-out cross-validation analysis was performed as another sensitivity analysis (Supplemental Table 2). Random splitting of the merged total dataset in a 7:3 ratio yielded a training set (n = 1,478) and a test set (n = 633) for analysis. The LR-based ML model established in the training set yielded an AUROC of 0.757 to 0.828. The model performance remained excellent in the test set, with an AUROC of 0.778 to 0.824 for all outcomes.

To assess the impact of the missing data, a complete data analysis without imputation was performed (Supplemental Tables 3 and 4, Supplemental Figure 1). Both the main result and the SHAP analysis results were consistent, demonstrating the robustness of established ML models across all outcomes.

SHAP feature importance analysis

The relative importance of the top 8 features as significant factors for each clinical outcome was determined using Shapley values (Figure 5). The high-rank variables varied depending on the outcomes of interest. Variables with the highest importance were age (for MACE and all-cause death), AF (for HF-adm), and apical HCM (as a protective predictor for stroke). Increased LA diameter and hypertension, which ranked highly in the ML models, were strongly associated with all 4 outcomes. Increased LVESVi and cancer history were each substantially linked with 2 individual outcomes (for LVESVi, all-cause death and HF-adm; for cancer history, all-cause death and stroke) and with MACE.

Figure 5

Relative Importance of Features in Discriminative Models by SHAP Values

Feature importance was determined and presented with the SHAP values that represent the relative contribution of each feature to the model. Features are listed from the top in order of their relative importance. AF = atrial fibrillation; BMI = body mass index; HTN = hypertension; LA = left atrium; LVESVi = left ventricular end-systolic volume index; SHAP = Shapley additive explanation; other abbreviations as in Figures 1 and 2.

The variables selected through the traditional stepwise logistic regression method were compared to the variables identified using the SHAP method (Supplemental Table 5). One consistent finding was the significance of the LA diameter across all outcomes. However, there were notable differences in the lists of selected variables for each outcome.

Discussion

In this study, we developed ML-based discriminative models using independent consecutive HCM cohorts from 2 tertiary referral centers, focusing on the risk of MACE, including all-cause death, HF-adm, and stroke (Central Illustration). Our findings can be summarized as follows. First, among the 4 ML models, the LR-based ML algorithm had the best discriminant ability for all 4 outcomes. Second, LR models for all outcomes were well-calibrated and maintained good discriminant ability when applied to the external validation cohort, except for stroke. Third, the relative importance of clinical and echocardiographic parameters in discriminating each outcome was determined using the SHAP analysis. We observed that LA diameter and hypertension had substantial importance in discriminating all 4 outcomes.

Central Illustration

Machine Learning-Based Discrimination of Cardiovascular Outcomes in Hypertrophic Cardiomyopathy Patients

This study aimed to develop and validate machine learning (ML)-based models for discriminating major cardiovascular events in hypertrophic cardiomyopathy (HCM) patients, using data from 2 large-scale HCM cohorts from independent tertiary referral centers. Among the 4 ML models, the logistic regression-based ML algorithm had the best discriminant ability for all 4 outcomes. The relative importance of clinical and echocardiographic parameters in discriminating each outcome was determined using the SHapley Additive exPlanations (SHAP) analysis. We observed that left atrial (LA) diameter and hypertension (HTN) had substantial importance in predicting all 4 outcomes in patients with HCM. AF = atrial fibrillation; AUC = area under the curve; BMI = body mass index; HF = heart failure; LDA = linear discriminant analysis; LR = logistic regression; LVESVi = left ventricular end-systolic volume index; MACE = major adverse cardiovascular event(s); RF = random forest; SVM = support vector machine.

In patients with HCM, major cardiovascular complications, such as HF or stroke, significantly affect quality of life and prognosis.21 Patients with HCM are at a substantially greater risk of mortality owing to cardiovascular diseases than those without.4 In contrast to SCD, studies on prediction models and predictors for adverse cardiovascular outcomes in HCM are scarce. Current prediction models depend on single-center observational data, established based on traditional methodologies with hand-crafted criteria.2 These models may be vulnerable and perform poorly when applied to new data.2 By thoroughly incorporating multidimensional data and factors that interact in linear and nonlinear manners, the ML-based methodology may provide a model with significantly enhanced prediction performance.22 Indeed, ML models to predict ventricular arrhythmia,14 HF,13,14 or composite cardiovascular events12 were recently presented in HCM populations. However, all previous studies only offered an ML model based on single-center data that had not been verified in an external cohort. This is a critical drawback that substantially limits the applicability of the established models. There is still room for improvement regarding the precision and transferability of these ML models, along with a need for tailored approaches in HCM patient care.

Our study has the following methodological strengths: 1) for the first time, we established the generalizability of a model developed from 2 independent HCM cohorts (each with over 1,000 patients); 2) we analyzed and compared 4 ML approaches frequently used in classification, including LR, linear discriminant analysis, random forest, and support vector machine, to determine the most appropriate ML prediction model for our data distribution; and 3) we trained the models with 20-fold cross-validation. Moreover, sensitivity analyses using cross-over or hold-out cross-validation further consolidated the reliability of our results.

The LR ML model provided the best discriminant ability for all 4 outcomes of interest in this study population. Even the lowest AUROCs of the ML models in our study, which ranged from 0.73 to 0.79, were comparable to those of prior research. Consequently, the data structure and distribution from the derivation and validation cohorts used in this study were generally suitable for developing and applying ML-based models. This indicates that our data may be utilized to train and validate other ML-based models with novel methodologies.

Notably, the discriminant performance of the model for stroke dropped significantly compared with other outcomes during external validation. A discrepancy in the baseline distribution between the derivation and validation cohorts might have contributed to this result. In particular, features directly associated with the occurrence of stroke, such as previous history of stroke, prevalence of AF, and LA diameter on echocardiography, differed significantly between the 2 cohorts. Furthermore, because the cohorts used in this study mainly focused on HCM and its cardiovascular outcomes, unmeasured factors that may significantly affect stroke occurrence (such as carotid artery stenosis, smoking status, alcohol intake, or physical activity)23 may not have been considered in our models. To further improve the performance of the stroke discriminative models, subsequent studies using cohorts with various phenotypes would be helpful.

The ML model is a black-box system, making its decision-making difficult to comprehend. This limits its implications for clinical practice despite the enhanced discriminant ability.24 To overcome these limitations, we used the SHAP method, an explainable AI methodology based on game theory that indexes and communicates the degree of the relative contribution of features to the discriminative model via the Shapley value. The SHAP analysis demonstrated that the relative importance of variables changes meaningfully based on the outcomes. These outcome-specific important features should be considered in further studies with larger, more comprehensive predictive models for cardiovascular outcomes in HCM patients. The significant discrepancies between the traditional stepwise variable selection and the ML-based SHAP approach highlight an opportunity to update significant predictors considering high-dimensional intervariable interactions, and may provide novel clinical insights as well as new treatment targets.

The SHAP analysis revealed that LA diameter and hypertension were common significant factors of the 4 outcomes of interest. Although routinely measured in echocardiography examinations, the size of the LA cavity is often overlooked in clinical practice. As a representative, chronic marker for the diastolic function of the LV, the LA size should be considered an essential imaging index for long-term prognosis in patients with HCM.25 It is crucial to investigate how LA reservoir strain, recently reported as a novel parameter with a strong predictive ability for incident HF in patients with HCM, may enhance ML-based models.26 Hypertension is the leading cause of cardiovascular disease, substantially impacting mortality, HF, and stroke.27 In our study cohorts, 54.4% of patients diagnosed with HCM had concomitant hypertension. This proportion is higher than that in studies recruiting a relatively young Western HCM population.12,13 However, it is comparable to Japanese data28 or the Korean national health insurance database,1 reflecting the different characteristics of the study population. Uncontrolled hypertension enhances LV hypertrophy and hastens unfavorable remodeling of the myocardium.29 Considering the continually aging HCM population,6 it is necessary to carefully examine hypertension in patients with HCM at diagnosis and during follow-up, and to make every effort to manage blood pressure adequately. Furthermore, the LV chamber size and the patient's malignancy history, which were noted as additional key determinants in the occurrence of unfavorable cardiovascular events, including mortality, should not be ignored at the time of HCM diagnosis, especially in the contemporary era of enhanced HCM care.

Notably, apical type HCM was the most important factor inversely correlated with incident stroke. Similarly, a Japanese HCM cohort study observed that apical HCM is linked with a lower incidence of thromboembolic events, including stroke.30 Apical HCM is characterized by less severe diastolic dysfunction and a smaller extent of myocardial fibrosis than HCM with septal hypertrophy.31 In this study, patients with non-apical HCM had a higher E/e’ ratio than those with apical HCM (14.7 ± 7.4 vs 12.9 ± 5.5) despite similar LA cavity sizes, suggesting a higher chance of LV diastolic dysfunction in patients with non-apical HCM. Furthermore, patients with HCM with a greater extent of septal hypertrophy were prone to AF, possibly explaining the higher risk of stroke in patients with non-apical HCM.32 Because our data cannot establish a causal relationship, this finding is currently hypothesis-generating at best. Future prospective studies should validate the risk of stroke according to the type of HCM and reveal its mechanistic background.

Study limitations

First, this study was based on real-world observational data from Asian cohorts. Therefore, the generalizability of the current ML models should be re-examined internationally using external HCM cohorts. Our established ML model pipeline can also be applied in future research for validation within specific populations defined by age, sex, or underrepresented groups. Second, we used ML classifiers that did not account for the time variable and, thus, did not consider the longitudinal aspect of the cohort study. However, our primary objective in this research was to compare various well-established and widely used ML classifiers. Third, the short follow-up periods and relatively few events for each individual outcome may limit the statistical power of this study. Fourth, the black box nature of ML models limits their interpretability; we sought to mitigate this by using the SHAP method to elucidate the influence and importance of individual variables. Fifth, this study did not include a detailed assessment of HCM disease severity or risk stratification for HCM. Sixth, our data did not include parameters from advanced imaging techniques or functional tests, such as LV global longitudinal strain, LA reservoir strain, cardiac magnetic resonance imaging, treadmill tests, or Holter monitoring. However, advanced imaging techniques (eg, cardiac magnetic resonance) have their own limitations, such as long acquisition times, limited accessibility, and challenges in real-time evaluation. Importantly, we incorporated the most commonly used clinical and echocardiographic parameters to maximize the adaptability of the established model.
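As a rough illustration of how SHAP can attribute a logistic regression output to individual features (the fourth limitation above), the sketch below uses the closed-form result for linear models under an assumed feature independence: each feature's contribution to the log-odds is its coefficient multiplied by its deviation from the background mean. The feature names, coefficients, intercept, and 4-patient background cohort are hypothetical and synthetic; they are not the fitted model or data from this study, and the SHAP implementation actually used in the study may differ.

```python
import math

# Hypothetical features and coefficients (log-odds per unit).
# These are illustrative assumptions, NOT the study's fitted model.
FEATURES = ["la_diameter_mm", "hypertension", "apical_type"]
COEFS = [0.06, 0.50, -0.70]
INTERCEPT = -5.0

# Synthetic background cohort (made-up values, for illustration only).
BACKGROUND = [
    [38.0, 0.0, 1.0],
    [45.0, 1.0, 0.0],
    [52.0, 1.0, 0.0],
    [41.0, 0.0, 0.0],
]


def log_odds(x):
    """Linear predictor of the logistic regression model."""
    return INTERCEPT + sum(b * v for b, v in zip(COEFS, x))


def feature_means(rows):
    """Column-wise means of the background cohort."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def linear_shap(x, means):
    """SHAP values in log-odds space for a linear model, assuming
    independent features: phi_j = beta_j * (x_j - mean_j)."""
    return [b * (v - m) for b, v, m in zip(COEFS, x, means)]


def mean_abs_shap(rows):
    """Global importance: mean absolute SHAP value per feature."""
    means = feature_means(rows)
    totals = [0.0] * len(FEATURES)
    for r in rows:
        for j, phi_j in enumerate(linear_shap(r, means)):
            totals[j] += abs(phi_j)
    return {name: t / len(rows) for name, t in zip(FEATURES, totals)}


MEANS = feature_means(BACKGROUND)
patient = [52.0, 1.0, 0.0]  # hypothetical non-apical, hypertensive patient
phi = linear_shap(patient, MEANS)
baseline = sum(log_odds(r) for r in BACKGROUND) / len(BACKGROUND)
importance = mean_abs_shap(BACKGROUND)

# Additivity: per-patient contributions sum to the gap between this
# patient's log-odds and the cohort-average log-odds.
assert math.isclose(sum(phi), log_odds(patient) - baseline, abs_tol=1e-9)

for name, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean |SHAP| = {value:.4f}")
```

Ranking features by mean absolute SHAP value gives a global importance ordering of the kind discussed above, while the per-patient values sum to the difference between that patient's log-odds and the cohort average.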

Conclusions

Applying ML approaches to models incorporating a wide variety of phenotypes improved the ability to discriminate major cardiovascular events and identified features of high importance for each outcome in HCM patients. If shown to be clinically applicable, these models may help in the early recognition and management of high-risk subsets of HCM patients.

Perspectives

COMPETENCY IN PATIENT CARE AND PROCEDURAL SKILLS: We developed ML models for discriminating all-cause death, admission for heart failure, and stroke in HCM patients using data from 2 referral centers. ML-based discriminative models enhanced the accuracy of predicting major cardiovascular events in HCM patients, identifying important features for each outcome. Early recognition and management of high-risk HCM subsets could be facilitated by applying these models.

TRANSLATIONAL OUTLOOK: The findings of this study, which were derived from Asian data, underscore the importance of validating the results in cohorts representing diverse ethnic backgrounds, while also emphasizing the need to refine the discriminative model by incorporating data from advanced cardiovascular imaging or functional tests.

Funding Support and Author Disclosures

This study was supported by a research grant from the Seoul National University Research fund (no. 800-20210548). The funding source had no involvement in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication. The authors have reported that they have no relationships relevant to the contents of this paper to disclose.

Abbreviations and Acronyms

AF = atrial fibrillation

AUROC = area under the receiver operating characteristic curve

HCM = hypertrophic cardiomyopathy

HF = heart failure

HF-adm = heart failure admission

LA = left atrium

LR = logistic regression

LV = left ventricle

LVESVi = left ventricle end-systolic volume index

ML = machine learning

SCD = sudden cardiac death

SHAP = SHapley Additive exPlanations

References

  • 1. Moon I., Lee S.Y., Kim H.K., et al. "Trends of the prevalence and incidence of hypertrophic cardiomyopathy in Korea: a nationwide population-based cohort study". PLoS One . 2020;15:e0227012.

  • 2. O'Mahony C., Jichi F., Pavlou M., et al. "A novel clinical risk prediction model for sudden cardiac death in hypertrophic cardiomyopathy (HCM risk-SCD)". Eur Heart J . 2014;35:2010-2020.

  • 3. Lee H.J., Kim J., Chang S.A., Kim Y.J., Kim H.K., Lee S.C. "Major clinical issues in hypertrophic cardiomyopathy". Korean Circ J . 2022;52:563-575.

  • 4. Kwon S., Kim H.K., Kim B., et al. "Comparison of mortality and cause of death between adults with and without hypertrophic cardiomyopathy". Sci Rep . 2022;12:6386.

  • 5. Lee H.J., Kim H.K., Lee S.C., et al. "Supplementary role of left ventricular global longitudinal strain for predicting sudden cardiac death in hypertrophic cardiomyopathy". Eur Heart J Cardiovasc Imaging . 2022;23:1108-1116.

  • 6. Canepa M., Fumagalli C., Tini G., et al. "Temporal trend of age at diagnosis in hypertrophic cardiomyopathy: an analysis of the International Sarcomeric Human Cardiomyopathy Registry". Circ Heart Fail . 2020;13:e007230.

  • 7. Choi Y.J., Kim H.K., Hwang I.C., et al. "Prognosis of patients with hypertrophic cardiomyopathy and low-normal left ventricular ejection fraction". Heart . 2023;109(10):771-778. https://doi.org/10.1136/heartjnl-2022-321853.

  • 8. Choi Y.J., Kim B., Rhee T.M., et al. "Augmented risk of ischemic stroke in hypertrophic cardiomyopathy patients without documented atrial fibrillation". Sci Rep . 2022;12:15785.

  • 9. Lee H.J., Kim H.K., Kim M., et al. "Clinical impact of atrial fibrillation in a nationwide cohort of hypertrophic cardiomyopathy patients". Ann Transl Med . 2020;8:1386.

  • 10. Ambale-Venkatesh B., Yang X., Wu C.O., et al. "Cardiovascular event prediction by machine learning: the Multi-Ethnic Study of Atherosclerosis". Circ Res . 2017;121:1092-1101.

  • 11. Shah S.J., Katz D.H., Selvaraj S., et al. "Phenomapping for novel classification of heart failure with preserved ejection fraction". Circulation . 2015;131:269-279.

  • 12. Kochav S.M., Raita Y., Fifer M.A., et al. "Predicting the development of adverse cardiac events in patients with hypertrophic cardiomyopathy using machine learning". Int J Cardiol . 2021;327:117-124.

  • 13. Fahmy A.S., Rowin E.J., Manning W.J., Maron M.S., Nezafat R. "Machine learning for predicting heart failure progression in hypertrophic cardiomyopathy". Front Cardiovasc Med . 2021;8:647857.

  • 14. Smole T., Zunkovic B., Piculin M., et al. "A machine learning-based risk stratification model for ventricular tachycardia and heart failure in hypertrophic cardiomyopathy". Comput Biol Med . 2021;135:104648.

  • 15. Lang R.M., Badano L.P., Mor-Avi V., et al. "Recommendations for cardiac chamber quantification by echocardiography in adults: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging". J Am Soc Echocardiogr . 2015;28:1-39.e14.

  • 16. Ponikowski P., Voors A.A., Anker S.D., et al. "2016 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure: the Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC)". Eur Heart J . 2016;37:2129-2200.

  • 17. Pfob A., Lu S.C., Sidey-Gibbons C. "Machine learning in medicine: a practical introduction to techniques for data pre-processing, hyperparameter tuning, and model comparison". BMC Med Res Methodol . 2022;22:282.

  • 18. Sidey-Gibbons J.A.M., Sidey-Gibbons C.J. "Machine learning in medicine: a practical introduction". BMC Med Res Methodol . 2019;19:64.

  • 19. Lundberg S.M., Erion G., Chen H., et al. "From local explanations to global understanding with explainable AI for trees". Nat Mach Intell . 2020;2:56-67.

  • 20. Wang K., Tian J., Zheng C., et al. "Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP". Comput Biol Med . 2021;137:104813.

  • 21. Maron B.J., Rowin E.J., Casey S.A., et al. "Risk stratification and outcome of patients with hypertrophic cardiomyopathy ≥60 years of age". Circulation . 2013;127:585-593.

  • 22. Krittanawong C., Johnson K.W., Rosenson R.S., et al. "Deep learning for cardiovascular medicine: a practical primer". Eur Heart J . 2019;40:2058-2073.

  • 23. GBD 2019 Stroke Collaborators. "Global, regional, and national burden of stroke and its risk factors, 1990-2019: a systematic analysis for the Global Burden of Disease Study 2019". Lancet Neurol . 2021;20(10):795-820. https://doi.org/10.1016/S1474-4422(21)00252-0.

  • 24. Petch J., Di S., Nelson W. "Opening the black box: the promise and limitations of explainable machine learning in cardiology". Can J Cardiol . 2022;38:204-213.

  • 25. Nagueh S.F., Smiseth O.A., Appleton C.P., et al. "Recommendations for the evaluation of left ventricular diastolic function by echocardiography: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging". J Am Soc Echocardiogr . 2016;29:277-314.

  • 26. Lee H.J., Kim H.K., Rhee T.M., et al. "Left atrial reservoir strain-based left ventricular diastolic function grading and incident heart failure in hypertrophic cardiomyopathy". Circ Cardiovasc Imaging . 2022;15:e013556.

  • 27. Unger T., Borghi C., Charchar F., et al. "2020 International Society of Hypertension global hypertension practice guidelines". Hypertension . 2020;75:1334-1357.

  • 28. Higuchi S., Minami Y., Shoda M., et al. "Effect of renal dysfunction on risk of sudden cardiac death in patients with hypertrophic cardiomyopathy". Am J Cardiol . 2021;144:131-136.

  • 29. Gonzalez A., Ravassa S., Lopez B., et al. "Myocardial remodeling in hypertension". Hypertension . 2018;72:549-558.

  • 30. Haruki S., Minami Y., Hagiwara N. "Stroke and embolic events in hypertrophic cardiomyopathy: risk stratification in patients without atrial fibrillation". Stroke . 2016;47:936-942.

  • 31. Kim E.K., Lee S.C., Hwang J.W., et al. "Differences in apical and non-apical types of hypertrophic cardiomyopathy: a prospective analysis of clinical, echocardiographic, and cardiac magnetic resonance findings and outcome from 350 patients". Eur Heart J Cardiovasc Imaging . 2016;17:678-686.

  • 32. Park K.M., Im S.I., Kim E.K., et al. "Atrial fibrillation in hypertrophic cardiomyopathy: is the extent of septal hypertrophy important?" PLoS One . 2016;11:e0156410.


Footnotes

The authors attest they are in compliance with human studies committees and animal welfare regulations of the authors’ institutions and Food and Drug Administration guidelines, including patient consent where appropriate. For more information, visit the Author Center.