Abstract:
OBJECTIVE: To predict the risk factors affecting activities of daily living (ADL) in stroke patients using machine learning algorithms, and to provide a reference for ADL-related decision-making for these patients.
METHODS: A total of 423 stroke patients treated in the Rehabilitation Medicine Center of Jiangsu Provincial People's Hospital from January 2015 to February 2019 were retrospectively analyzed. According to the Barthel index (BI) rating scale, the patients were divided into a better ADL group (BI ≥ 60 points) and a worse ADL group (BI < 60 points), and the data were preprocessed. Covariate diagnosis and the least absolute shrinkage and selection operator (LASSO) were used to screen feature variables. Five machine learning algorithms were selected for predictive modeling: logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and K-nearest neighbor (KNN). After ten-fold cross-validation, the models were comprehensively evaluated using the ROC curve, AUC, PR curve, PRAUC, accuracy, sensitivity, and specificity. SHapley Additive exPlanations (SHAP) was then introduced to interpret the optimal machine learning model.
RESULTS: After LASSO regression analysis, a total of 16 feature variables were used to construct the machine learning models. The RF model had the highest AUC (0.96), PRAUC (0.58/0.58), accuracy (0.80), sensitivity (0.75), and specificity (0.97). SHAP analysis of the RF model showed that, among the five features contributing most to ADL, Brunnstrom staging (lower extremity) had the largest effect, followed by Brunnstrom staging (upper extremity), D-dimer, serum albumin level, and age.
CONCLUSION: Among the five models compared, the random forest model was the most effective in predicting the ADL of stroke patients, and it can serve as a reference for decisions about the daily care of these patients.
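
The modeling workflow described in METHODS can be illustrated with a minimal sketch. It is not the authors' code. It assumes scikit-learn, uses synthetic data in place of the patient cohort, uses an L1-penalized logistic regression as a LASSO-style screening step, and substitutes scikit-learn's GradientBoostingClassifier for XGBoost so that the example needs no extra dependency. All variable names, hyperparameters, and the choice to run feature screening inside each fold are assumptions made for illustration. The SHAP step is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cohort (assumption): 423 rows, label 1 = BI >= 60.
X, y = make_classification(
    n_samples=423, n_features=30, n_informative=8, random_state=0
)


def lasso_selector():
    # L1-penalised logistic regression keeps only features with non-zero weights.
    l1_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    return SelectFromModel(l1_model)


# Five classifiers; GradientBoostingClassifier is a stand-in for XGBoost.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "GB (XGBoost stand-in)": GradientBoostingClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
}

# Metrics named in the abstract; specificity is recall of the negative class.
scoring = {
    "auc": "roc_auc",
    "prauc": "average_precision",
    "accuracy": "accuracy",
    "sensitivity": "recall",
    "specificity": make_scorer(recall_score, pos_label=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Scaling and feature screening are refit in every fold to avoid leakage.
    pipe = make_pipeline(StandardScaler(), lasso_selector(), model)
    scores = cross_validate(pipe, X, y, cv=cv, scoring=scoring)
    results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 2) for m, v in metrics.items()})
```

Placing the screening step inside the pipeline means each fold selects its own features, which may differ from a single up-front LASSO selection such as the 16 variables reported in RESULTS.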