Development and validation of an all-cause mortality risk prediction model utilizing multiple machine learning algorithms for maintenance hemodialysis patients

Affiliations:

1. Hemodialysis Center, Wuxi Second People's Hospital, Wuxi 214000, Jiangsu, China; 2. Hemodialysis Center, Jianhu County People's Hospital, Yancheng 224700, Jiangsu, China; 3. Hemodialysis Center, Affiliated Hospital of Jiangnan University, Wuxi 214122, Jiangsu, China

CLC number:

R692.5

    Abstract:

    Objective: To construct and validate prediction models for all-cause mortality in maintenance hemodialysis (MHD) patients using multiple machine learning algorithms. Methods: Clinical data were collected from 694 patients at four hemodialysis centers in Jiangsu Province: 591 MHD patients from three tertiary Grade A hospitals in Wuxi (January 2017 to December 2023) and 103 MHD patients from one secondary Grade A hospital in Yancheng (January to December 2024). The 591 Wuxi patients were randomly divided at a 7:3 ratio into a training set (n=414) for model development and a validation set (n=177) for internal validation; the 103 Yancheng patients served as a test set for external validation. Predictors were selected with the least absolute shrinkage and selection operator (LASSO), and ten machine learning algorithms were used to build all-cause mortality risk prediction models. Receiver operating characteristic (ROC) curves were plotted to evaluate predictive performance, calibration curves were used to assess the accuracy of the predicted probabilities, and decision curve analysis (DCA) was used to assess clinical net benefit across decision thresholds. In external validation, the area under the ROC curve (AUC) was used to assess the generalization ability of the best model, and SHapley Additive exPlanations (SHAP) were used to rank variable importance. Results: All-cause mortality was 42.6% (252/591). Among the ten models, the support vector machine (SVM) performed best, with an AUC of 0.928, a sensitivity of 89.47%, and an accuracy of 0.919. Calibration curves and DCA indicated good agreement and clinical benefit, and the Brier score of 0.089 indicated low prediction error and good calibration on the internal validation dataset. External validation yielded an AUC of 0.835, suggesting that the model generalizes well. In the SHAP plot, the factors influencing all-cause mortality ranked, in descending order of importance, as follows: living alone, tunneled cuffed catheter (TCC), prealbumin, albumin, Charlson comorbidity index (CCI) score, intact parathyroid hormone (iPTH) <300 pg/mL, age, junior high school education or lower, blood urea nitrogen-to-creatinine ratio, diabetic nephropathy, college degree or higher, and sex. Conclusion: The SVM-based model predicts all-cause mortality in MHD patients well and can help identify high-risk patients and support clinical decision-making and intervention.
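
The abstract reports discrimination (AUC), calibration error (Brier score), sensitivity, and accuracy. As a rough illustration of what these numbers measure, the following sketch computes them in plain Python. The function names, the 0.5 classification threshold, and the toy labels and probabilities are assumptions made for illustration; they are not the study's code or data.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outranks a random negative case."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # A tie between a positive and a negative case counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_and_accuracy(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Classify as positive when probability >= threshold (an assumed cut-off)."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 1)
    fn = sum(1 for y, h in zip(y_true, pred) if y == 1 and h == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity needs at least one positive case")
    correct = sum(1 for y, h in zip(y_true, pred) if y == h)
    return tp / (tp + fn), correct / len(y_true)


# Hypothetical toy data: 1 = died, 0 = survived; probabilities are made up.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
sens, acc = sensitivity_and_accuracy(y_true, y_prob)
```

A lower Brier score means predicted probabilities sit closer to observed outcomes, which is why the abstract reads a score of 0.089 as low prediction error. AUC depends only on how cases are ranked, so a model can have a high AUC and still be poorly calibrated; that is the reason for reporting both.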

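Decision curve analysis compares a model's net benefit with the default strategies of treating every patient or no patient at each threshold probability. The sketch below uses the standard net benefit formula, TP/n - (FP/n) * pt / (1 - pt); the function names and toy data are hypothetical and do not come from the study.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Net benefit of flagging patients whose predicted risk is >= pt."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * pt / (1.0 - pt)


def net_benefit_treat_all(y_true: Sequence[int], pt: float) -> float:
    """Net benefit of flagging every patient, the usual reference curve."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * pt / (1.0 - pt)


# Hypothetical toy data: 1 = died, 0 = survived; probabilities are made up.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

# One (threshold, model, treat-all) row per threshold; treat-none is always 0.
curve = [
    (pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
    for pt in (0.2, 0.5)
]
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both reference strategies.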
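Cite this article

王娇, 周怡君, 孙文娟, 周静怡, 王依娜. Development and validation of an all-cause mortality risk prediction model utilizing multiple machine learning algorithms for maintenance hemodialysis patients [J]. Journal of Nanjing Medical University (Natural Sciences), 2026, (2): 247-255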

History
  • Published online: 2026-02-15