Chinese Journal of Child Health Care, 2023, Vol. 31, Issue (3): 241-245. DOI: 10.11852/zgetbjzz2022-1433

• Original Articles •

Construction and evaluation of a Gradient Boosting Machine (GBM) prediction model for children with autism spectrum disorder complicated with intellectual impairment

SONG Chao1, HU Lifei1, WU Lingling1, JIANG Zhongquan2

  1. Department of Developmental Behavioral Pediatrics, Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, Zhejiang 310051, China;
  2. School of Public Health, Lanzhou University
  • Received: 2022-11-29 Revised: 2022-12-29 Online: 2023-02-28 Published: 2023-03-01
  • Corresponding author: WU Lingling, E-mail: chdbpwll@zju.edu.cn
  • About the author: SONG Chao (born 1987), male, from Jiangxi; attending physician with a doctoral degree; main research interest: developmental and behavioral problems in children.
  • Funding:
    Zhejiang Provincial Natural Science Foundation (LGF20H090015)

Abstract: Objective To construct and evaluate a gradient boosting machine (GBM) model that predicts comorbid intellectual impairment in children with autism spectrum disorder (ASD), in order to offer a new perspective on early screening in this population.
Methods A total of 241 children diagnosed with ASD at the Children's Hospital, Zhejiang University School of Medicine between January 2017 and December 2021 were included. A GBM model was trained on sociodemographic and behavioral observation data and compared with conventional logistic regression (LR). Hyperparameters were tuned by grid search with ten-fold cross-validation, features were selected by cross-validated LASSO, and model performance was assessed by discrimination and calibration. Model explanations were derived with SHapley Additive exPlanations (SHAP).
Results Of the 241 children with ASD, 98 (40.66%) had comorbid intellectual impairment. LASSO selected eight predictors: language ability, mother's educational attainment, age at behavioral observation, stereotyped speech, pointing and/or gestures, quality of social overtures, unusual sensory interests, and repetitive behaviors or stereotyped interests. Both LR and GBM discriminated well between children with and without comorbid intellectual impairment, before and after feature selection. After feature selection, the area under the curve (AUC) of the GBM model (0.870, 95%CI: 0.749-0.989) was close to that of conventional LR (0.851, 95%CI: 0.704-0.921). Calibration was good for all models except the full-variable LR, which was poorly calibrated. Language ability was the most important feature for predicting comorbid intellectual impairment in children with ASD.
Conclusion The GBM model built after LASSO feature selection performs well in predicting comorbid intellectual impairment in children with ASD and has some clinical value.

Key words: autism spectrum disorder, intellectual impairment, machine learning, prediction model
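
The abstract reports discrimination as an AUC with a 95% confidence interval and mentions calibration without naming the statistics used. As a minimal sketch, assuming a percentile bootstrap for the interval and the Brier score as a simple summary that is sensitive to calibration (both are assumptions, not methods stated in the source), the evaluation step might look like this in plain Python. The labels and probabilities are hypothetical and unrelated to the study's data.

```python
import random


def auc(y_true, y_score):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical held-out predictions (1 = ASD with intellectual impairment).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]

point = auc(y_true, y_prob)
low, high = bootstrap_auc_ci(y_true, y_prob)
print(f"AUC = {point:.3f}, 95% CI: {low:.3f}-{high:.3f}")
print(f"Brier score = {brier_score(y_true, y_prob):.3f}")
```

With only ten hypothetical cases the bootstrap interval is wide. Small evaluation samples are one plausible reason why AUC intervals such as those reported in the abstract span a broad range.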

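
SHAP, the explanation method named in the abstract, attributes a prediction to individual features using Shapley values. The sketch below computes exact Shapley values by enumerating feature subsets for a toy three-feature score. The function `toy_model`, its coefficients, the 0/1 feature coding, and the all-zero baseline are hypothetical illustrations and are not the study's model. Exhaustive enumeration is only practical for a handful of features; the `shap` library relies on faster algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take their baseline value."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their observed value, the rest use the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical score with arbitrary coefficients and one interaction term."""
    language, stereotyped_speech, maternal_education = z
    return (0.10 + 0.40 * language + 0.10 * stereotyped_speech
            + 0.05 * maternal_education + 0.20 * language * stereotyped_speech)


x = [1, 1, 0]          # hypothetical case to explain
baseline = [0, 0, 0]   # hypothetical reference case
phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["language", "stereotyped_speech", "maternal_education"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score for `x` and the score for `baseline`, and the interaction term is split evenly between the two features involved.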