Densely Connected Swin-UNet for Multiscale Information Aggregation in Medical Image Segmentation

Ziyang Wang, Meiwen Su, Jian-Qing Zheng, Yang Liu


Lecture 10 Oct 2023

Image semantic segmentation is a dense prediction task in computer vision that has been dominated by deep learning techniques in recent years. The U-shaped network, a symmetric encoder-decoder Convolutional Neural Network (CNN) trained end to end with skip connections, has shown promising performance. To process multiscale feature information efficiently, we propose a new Densely Connected Swin-UNet (DCS-UNet) with multiscale information aggregation for medical image segmentation. Firstly, inspired by the ability of the Vision Transformer (ViT) to model long-range dependencies via self-attention, this work proposes the use of fully ViT-based network blocks with a shifted-window approach, resulting in a purely self-attention-based U-shaped segmentation network. The relevant layers, including feature sampling and image tokenization, are redesigned to align with the ViT fashion. Secondly, a full-scale deep supervision scheme is developed to process the aggregated feature maps at the various resolutions generated by different levels of decoders. Thirdly, dense skip connections are proposed that allow semantic feature information to be thoroughly transferred from different levels of encoders to lower-level decoders. The proposed method is validated on a public benchmark cardiac MRI segmentation dataset with comprehensive validation metrics, showing competitive performance against other encoder-decoder network variants. The code will be publicly available on GitHub.
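
The shifted-window idea mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names `cyclic_shift` and `window_partition`, the 4x4 toy grid, and the window size of 2 are all assumptions chosen for illustration. It only shows how a cyclic shift changes which tokens share a window, so that tokens separated by a window border in one layer can attend to each other in the next. Practical Swin-style models typically also mask attention between tokens that become neighbours only through wrap-around, which is omitted here.

```python
def cyclic_shift(grid, shift):
    """Roll a 2D token grid up and left by `shift` positions, with wrap-around."""
    h, w = len(grid), len(grid[0])
    return [[grid[(r + shift) % h][(c + shift) % w] for c in range(w)]
            for r in range(h)]


def window_partition(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major).

    Each block is returned as a flat list of the tokens it contains;
    self-attention would then be computed independently inside each block.
    """
    h, w = len(grid), len(grid[0])
    assert h % window == 0 and w % window == 0, "grid must tile evenly"
    windows = []
    for r0 in range(0, h, window):
        for c0 in range(0, w, window):
            windows.append([grid[r][c]
                            for r in range(r0, r0 + window)
                            for c in range(c0, c0 + window)])
    return windows


# Hypothetical toy input: a 4x4 grid where each token is labelled by its flat index.
tokens = [[r * 4 + c for c in range(4)] for r in range(4)]

# Regular windows: tokens 5 and 10 fall into different windows.
regular = window_partition(tokens, 2)

# Shifted windows (shift = window // 2): tokens 5, 6, 9, 10 now share a window.
shifted = window_partition(cyclic_shift(tokens, 1), 2)

print(regular[0])  # [0, 1, 4, 5]
print(shifted[0])  # [5, 6, 9, 10]
```

Alternating the regular and shifted partitions between consecutive blocks is what lets information cross window boundaries while attention cost stays local to each window.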

Tags: semantic segmentation, UNet, vision transformer
