An intelligent detection method of key frame in echocardiography
Abstract:
Objective: To explore the feasibility of a deep learning (DL)-based ResNet+VST model for intelligent detection of key frames in echocardiography. Methods: A total of 663 dynamic images covering three commonly used clinical examination views, apical two chambers (A2C), apical three chambers (A3C), and apical four chambers (A4C), were collected from the Department of Ultrasound Medicine at Drum Tower Hospital, Nanjing University Medical School. Additionally, 280 dynamic A4C images were selected from the EchoNet-Dynamic public dataset. Two datasets were thus established: the Nanjing Drum Tower Hospital dataset and the EchoNet-Dynamic-Tiny dataset. The images in each category were divided into training and testing sets at a 4:1 ratio. The ResNet+VST model was trained, and its performance was compared with that of several other key frame detection models to verify its superiority. Results: The ResNet+VST model detected the end-diastolic (ED) and end-systolic (ES) image frames of the heart more accurately. On the Nanjing Drum Tower Hospital dataset, the model achieved ED frame prediction differences of 1.52±1.09, 1.62±1.43, and 1.27±1.17 for the A2C, A3C, and A4C views, respectively, and ES frame prediction differences of 1.56±1.16, 1.62±1.43, and 1.45±1.38, respectively. On the EchoNet-Dynamic-Tiny dataset, it achieved an ED frame prediction difference of 1.62±1.26 and an ES frame prediction difference of 1.71±1.18, outperforming existing related studies. Furthermore, the ResNet+VST model exhibited good real-time performance, with average inference times of 21 ms and 10 ms for 16-frame ultrasound sequences on the Nanjing Drum Tower Hospital dataset and the EchoNet-Dynamic-Tiny dataset, respectively, on a GTX 3090Ti GPU. This was superior to related studies that used long short-term memory (LSTM) units for temporal modeling, and it essentially meets the requirements of real-time clinical processing.
Conclusion: The proposed ResNet+VST model demonstrates superior accuracy and real-time performance in the detection of key frames in echocardiography compared with existing research. In principle, the model can be applied to any ultrasound view and has the potential to help ultrasound physicians improve diagnostic efficiency.
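The frame prediction differences reported above (e.g. 1.52±1.09) read as the mean ± standard deviation of the absolute difference, in frame indices, between the predicted and the annotated ED/ES frames. The paper's exact evaluation code is not shown here; the following is a minimal sketch of such a metric under that assumption, with made-up frame indices rather than study data:

```python
import numpy as np

def frame_difference_stats(predicted, annotated):
    """Mean and std of the absolute frame-index error between predicted
    and annotated key frames (hypothetical reconstruction of the metric
    reported in the abstract, not the authors' code)."""
    predicted = np.asarray(predicted, dtype=float)
    annotated = np.asarray(annotated, dtype=float)
    diff = np.abs(predicted - annotated)  # per-clip frame error
    return diff.mean(), diff.std()

# Toy example: three clips with invented ED frame indices
mean_fd, std_fd = frame_difference_stats([10, 42, 31], [11, 40, 31])
print(f"{mean_fd:.2f} ± {std_fd:.2f}")  # → 1.00 ± 0.82
```

A lower mean indicates predictions closer to the sonographer's annotation; a difference near 1 frame is on the order of the inter-observer variability typically seen in ED/ES labeling.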