Constructing a risk prediction model for stroke patients' ability to care for themselves in daily life based on interpretable machine learning

Affiliation: The First Affiliated Hospital of Nanjing Medical University

Fund project: Study on acupuncture extending the thrombolytic time window in cerebral infarction through regulation of the JNK signaling-autophagy system

    Abstract:

    OBJECTIVE: To predict the risk of poor ability in activities of daily living (ADL) among stroke patients using machine learning algorithms, and to provide a reference for ADL-related decision-making in these patients. METHODS: A total of 423 stroke patients treated at the Rehabilitation Medicine Center of Jiangsu Provincial People's Hospital from January 2015 to February 2019 were retrospectively analyzed. According to the Barthel index (BI), patients were divided into a better ADL group (BI ≥ 60 points) and a poorer ADL group (BI < 60 points), and the data were preprocessed. Collinearity diagnostics and the least absolute shrinkage and selection operator (LASSO) were used to screen feature variables. Five machine learning algorithms were selected for predictive modeling: logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and K-nearest neighbors (KNN). After ten-fold cross-validation, the models were evaluated using the ROC curve, AUC, PR curve, PRAUC, accuracy, sensitivity, and specificity. Shapley additive explanations (SHAP) were then applied to interpret the best-performing model. RESULTS: After LASSO regression analysis, 16 feature variables were used to construct the machine learning models. The RF model had the highest AUC (0.96), PRAUC (0.58/0.58), accuracy (0.80), sensitivity (0.75), and specificity (0.97). SHAP analysis showed that, among the five features contributing most to ADL, Brunnstrom stage (lower extremity) had the greatest effect, followed by Brunnstrom stage (upper extremity), D-dimer, serum albumin level, and age. CONCLUSION: The random forest model performed best in predicting the ability of stroke patients to care for themselves in daily life, and it can serve as a reference for decision-making on daily self-care in stroke patients.
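
    The grouping rule and the threshold-based metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the Barthel scores and predicted probabilities below are invented for illustration, the function names are hypothetical, and treating the poorer ADL group (BI < 60) as the positive class is an assumption, since the abstract does not state which class was positive.

```python
from typing import List, Sequence, Tuple

BI_CUTOFF = 60  # cut-off stated in the abstract: BI >= 60 is the better ADL group


def adl_group(barthel_index: float) -> int:
    """Return 1 for the poorer ADL group (BI < 60), 0 for the better ADL group.

    Coding the poorer group as the positive class is an assumption.
    """
    return 1 if barthel_index < BI_CUTOFF else 0


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count true positives, false positives, true negatives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy_sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Accuracy, sensitivity (recall of positives), specificity (recall of negatives)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example data, not taken from the study.
barthel_scores: List[float] = [95, 40, 60, 55, 80, 20]
labels: List[int] = [adl_group(bi) for bi in barthel_scores]

# Hypothetical predicted probabilities of belonging to the poorer ADL group.
probs: List[float] = [0.10, 0.80, 0.60, 0.40, 0.20, 0.90]
preds: List[int] = [1 if p >= 0.5 else 0 for p in probs]  # 0.5 threshold is an assumption

acc, sens, spec = accuracy_sensitivity_specificity(labels, preds)
auc = roc_auc(labels, probs)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

    Accuracy, sensitivity, and specificity depend on the probability threshold chosen, whereas AUC summarizes ranking quality across all thresholds, which is why a model is usually reported with both kinds of metric.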

History
  • Received: 2024-01-04
  • Last revised: 2024-04-02
  • Accepted: 2024-05-14