VISION TRANSFORMER WITH PROGRESSIVE TOKENIZATION FOR CT METAL ARTIFACT REDUCTION
Songwei Zheng (Fuzhou University); Dong Zhang (Fuzhou University); ChunYan Yu (Fuzhou University); Danhong Zhu (Fuzhou University); Longlong Zhu (Fuzhou University); Hao Liu (Fuzhou University); Zhongzheng Huang (Fuzhou University)
SPS
High-quality computed tomography (CT) plays a vital role in clinical diagnosis, but metallic implants introduce severe metal artifacts into CT images and hinder doctors' decision-making. Much prior research on metal artifact reduction (MAR) is based on convolutional neural networks (CNNs). Recently, Transformers have demonstrated remarkable potential in computer vision, and Transformer-based methods have been applied to CT image denoising. Nevertheless, they have been little explored for MAR. To fill this gap, we propose what is, to the best of our knowledge, the first Transformer-based architecture for MAR. Our method builds on a standard Vision Transformer (ViT). We adopt progressive tokenization in place of the simple tokenization of ViT, which cannot model local anatomical information. In addition, to facilitate interaction among tokens, we use the cyclic shift from Swin Transformer. Experimental results show that the Transformer-based approach outperforms CNN-based methods to some degree.
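The two token-level ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how progressive tokenization is built, so the sketch assumes overlapping patch extraction as one common way to keep local structure in the tokens. The function names (`soft_split`, `cyclic_shift`, `reverse_cyclic_shift`) and the patch, stride, and shift values are hypothetical, chosen only for illustration.

```python
import numpy as np


def soft_split(image, patch, stride):
    """Cut a 2-D image into overlapping patches and flatten each into a token.

    Assumption: overlap (stride < patch) stands in for the local modelling
    that plain non-overlapping ViT tokenization lacks.
    Returns an array of shape (n_h, n_w, patch * patch).
    """
    # All patch-sized windows, shape (H - patch + 1, W - patch + 1, patch, patch).
    windows = np.lib.stride_tricks.sliding_window_view(image, (patch, patch))
    # Keep every `stride`-th window along both spatial axes.
    windows = windows[::stride, ::stride]
    n_h, n_w = windows.shape[:2]
    return windows.reshape(n_h, n_w, patch * patch)


def cyclic_shift(tokens, shift):
    """Roll a (H, W, C) token grid up and to the left by `shift` positions."""
    return np.roll(tokens, shift=(-shift, -shift), axis=(0, 1))


def reverse_cyclic_shift(tokens, shift):
    """Undo cyclic_shift so tokens return to their original positions."""
    return np.roll(tokens, shift=(shift, shift), axis=(0, 1))


# Toy 8x8 "CT slice" with distinct values so positions are easy to follow.
image = np.arange(64, dtype=np.float32).reshape(8, 8)
tokens = soft_split(image, patch=4, stride=2)      # 3x3 grid of 16-dim tokens
shifted = cyclic_shift(tokens, shift=1)            # neighbours change windows
restored = reverse_cyclic_shift(shifted, shift=1)  # original layout recovered
```

In a Swin-style block, window attention would be applied to `shifted` before the reverse shift, so tokens that sat on opposite sides of a window boundary can attend to each other. The attention step itself is omitted here.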