TransWnet: Integrating Transformers into CNNs via Row and Column Attention for Abdominal Multi-organ Segmentation
Yazhen Xie (Xiangtan University); Yanglin Huang (Xiangtan University); Yuan Zhang (Xiangtan University); Xuanya Li (Baidu); Xiongjun Ye (Xiangtan University); Kai Hu (Xiangtan University)
SPS
Modeling global relationships while extracting local details is crucial to improving multi-organ segmentation performance. Most existing U-shaped methods rely on feature fusion to address both challenges, but they still struggle to balance global relationships against local details. To address this, we propose TransWnet, a novel multi-organ segmentation framework that mines global relationships and local details from both intra- and inter-scale perspectives. At its core is a Row and Column Swin Transformer (RCST) module that efficiently captures global contextual features and constructs local information. Specifically, we design a parallel Row and Column Attention structure to model the global relationships of multi-scale encoded features, and further mine local information from these global relationships through a local window mechanism. Extensive experiments on the Synapse dataset show that our method outperforms state-of-the-art approaches and achieves accurate abdominal multi-organ segmentation.
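To make the idea of parallel Row and Column Attention concrete, the following is a minimal NumPy sketch of attention applied along each spatial axis of a feature map. It is not the authors' RCST module: the abstract does not specify projections, heads, window sizes, or how the two branches are merged, so this sketch assumes identity queries, keys, and values (no learned weights), a single head, and a simple average of the two branches. The function names `axis_attention` and `row_column_attention` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def axis_attention(feat, axis):
    """Self-attention along one spatial axis of an (H, W, C) feature map.

    axis=1: row attention, each row attends over its W positions.
    axis=0: column attention, each column attends over its H positions.
    Assumption: queries, keys, and values are the features themselves
    (no learned projections), purely for illustration.
    """
    # Arrange the data as (num_sequences, seq_len, C).
    x = feat if axis == 1 else feat.transpose(1, 0, 2)
    scale = 1.0 / np.sqrt(x.shape[-1])
    scores = (x @ x.transpose(0, 2, 1)) * scale   # (num_sequences, L, L)
    out = softmax(scores, axis=-1) @ x            # (num_sequences, L, C)
    return out if axis == 1 else out.transpose(1, 0, 2)


def row_column_attention(feat):
    """Run row and column attention in parallel and average the results.

    Assumption: averaging is a placeholder merge; the paper may combine
    the two branches differently.
    """
    return 0.5 * (axis_attention(feat, axis=1) + axis_attention(feat, axis=0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 6, 4))   # hypothetical H=8, W=6, C=4
    print(row_column_attention(feat).shape)
```

Each branch costs attention over sequences of length W or H rather than over all H×W positions at once, which is the usual motivation for factorizing attention along rows and columns.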