Abstract: Objective: To explore the use of a dilated residual attention network for constructing multi-modal fusion images of the temporomandibular joint (TMJ) region, and to offer insights for improving comprehensive diagnosis and treatment based on multi-modal fusion images of the oral and TMJ regions. Methods: A dilated residual attention network extracts image features from MRI and CBCT; the features are fused with a softmax weighting strategy, and an image reconstruction module then generates a single fused image from the fused features of the two modalities. Results: The final fused image shows the morphology of the condylar cortical bone, the condylar medullary bone, the muscles attached to the condyle, and the articular disc. On the two evaluation metrics, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), the method performed well, with PSNR in the range [10, 15] and SSIM in the range [0.4, 0.6]. Conclusion: The method achieves real-time image fusion; the fused image shows clear anatomical morphology, avoids the need to switch between modalities, and provides effective guidance for dental specialists in preoperative and postoperative clinical diagnosis.
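The softmax weighting step and the PSNR metric named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption: the activity level of each modality is taken as the L1 norm of its feature vector at each pixel, and a two-way softmax over those activity levels gives the per-pixel fusion weights. The function names (`activity_map`, `softmax_fuse`, `psnr`) and the `[channel][row][column]` nested-list layout are hypothetical and chosen only to keep the example dependency-free; they are not the paper's implementation.

```python
import math


def activity_map(features):
    """Per-pixel L1 norm across channels; features is indexed [C][H][W]."""
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    return [
        [sum(abs(features[c][y][x]) for c in range(channels)) for x in range(width)]
        for y in range(height)
    ]


def softmax_fuse(feat_a, feat_b):
    """Fuse two feature maps of equal shape with per-pixel softmax weights.

    Assumption: weights come from a softmax over the two activity maps.
    """
    act_a = activity_map(feat_a)
    act_b = activity_map(feat_b)
    fused = []
    for c in range(len(feat_a)):
        plane = []
        for y in range(len(act_a)):
            row = []
            for x in range(len(act_a[0])):
                # Subtract the maximum before exponentiating for numerical stability.
                m = max(act_a[y][x], act_b[y][x])
                exp_a = math.exp(act_a[y][x] - m)
                exp_b = math.exp(act_b[y][x] - m)
                w_a = exp_a / (exp_a + exp_b)
                row.append(w_a * feat_a[c][y][x] + (1.0 - w_a) * feat_b[c][y][x])
            plane.append(row)
        fused.append(plane)
    return fused


def psnr(reference, image, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two 2-D images of equal size."""
    squared_errors = [
        (r - i) ** 2
        for row_r, row_i in zip(reference, image)
        for r, i in zip(row_r, row_i)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

In this sketch, the modality with the stronger feature response at a pixel receives the larger weight, and the two weights always sum to one. SSIM is omitted here because it requires local windowed statistics; in practice an established library implementation would normally be used for both metrics.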