Objective: To explore the feasibility of constructing multimodal fused images of the temporomandibular joint region using a dilated residual attention network, and to assess whether such fusion imaging can improve comprehensive diagnostic and therapeutic capabilities for the temporomandibular joint. Methods: A dilated residual attention network was used to extract image features from MR and CBCT images, and a softmax weighting strategy was used to fuse the extracted features. The fused features of each corresponding MR and CBCT image pair were then passed through an image reconstruction module to produce the fused image. Results: The fused images presented the morphology of the condylar cortical bone, the condylar medullary bone, the muscles attached to the condyle, and the articular disc. Measured by peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), the fused images achieved a PSNR of 10 to 15 and an SSIM of 0.4 to 0.6. Conclusion: This method can achieve real-time image fusion, and the final fused image clearly reflects anatomical morphological features, thereby avoiding the need to switch between images of different modalities and providing effective guidance for dental experts in preoperative and postoperative clinical diagnosis.
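The softmax weighting step and the PSNR metric named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `softmax_fuse` and `psnr`, the (C, H, W) feature layout, and the use of the channel-wise L1 norm as the per-pixel activity measure are all assumptions made for illustration, and the feature extractor and reconstruction module are omitted.

```python
import numpy as np


def softmax_fuse(feat_mr, feat_cbct):
    """Fuse two (C, H, W) feature maps with per-pixel softmax weights.

    Hypothetical helper: the activity measure (L1 norm over channels)
    is an assumed choice, not taken from the abstract.
    """
    # Per-pixel activity level of each modality, shape (2, H, W).
    activity = np.stack([
        np.abs(feat_mr).sum(axis=0),
        np.abs(feat_cbct).sum(axis=0),
    ])
    # Subtract the per-pixel maximum for numerical stability.
    activity = activity - activity.max(axis=0, keepdims=True)
    weights = np.exp(activity)
    weights /= weights.sum(axis=0, keepdims=True)
    # Each (H, W) weight map broadcasts over the channel axis.
    fused = weights[0] * feat_mr + weights[1] * feat_cbct
    return fused, weights


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in decibels for images on [0, data_range]."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)
```

In this sketch the modality with the stronger local feature response receives the larger weight at each pixel, and the two weights always sum to one, so the fused features stay on the same scale as the inputs.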