Deep Features Fusion With Mutual Attention Transformer For Skin Lesion Diagnosis

Li Zhou, Yan Luo

Length: 00:05:30

22 Sep 2021

Early diagnosis of skin lesions is crucial for preventing skin cancer, and deep learning (DL) methods are widely used to support dermatologists' diagnoses. The data for these diagnostic tasks include dermoscopic lesion images and textual information, and learning features from such multimodal data to improve diagnostic quality remains a challenge. Inspired by the vision-and-language integration models used in Visual Question Answering (VQA), we present an end-to-end neural network model for skin lesion diagnosis that uses images and textual information simultaneously. Specifically, we extract fine-grained features from the two modalities (image and text) of the dataset with pre-trained DL models. We propose a novel approach named Mutual Attention Transformer (MAT), which consists of self-attention blocks and guided-attention blocks, to enable the features from both modalities to interact concurrently. We then develop a fusion mechanism to integrate the resulting feature representations before the final classification output layer. Experimental results on the HAM10000 dataset demonstrate that the proposed method outperforms state-of-the-art methods for skin lesion diagnosis.
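The abstract does not give the internals of MAT, so the following is only a minimal sketch of the general idea it names: self-attention within each modality, guided attention in which each modality queries the other, and a simple fusion step. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`attention`, `mutual_attention_layer`, `fuse`), the absence of learned projection weights, the residual connections, the mean-pool-and-concatenate fusion, and the random stand-in features.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output row per query.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def residual(x, update):
    """Element-wise x + update (assumed residual connection)."""
    return [[a + b for a, b in zip(rx, ru)] for rx, ru in zip(x, update)]


def mutual_attention_layer(img, txt):
    """One hypothetical layer: self-attention, then guided attention."""
    # Self-attention: each modality attends to its own features.
    img = residual(img, attention(img, img, img))
    txt = residual(txt, attention(txt, txt, txt))
    # Guided attention: queries from one modality, keys/values from the other.
    # Both updates read the same inputs, so the two directions are symmetric.
    img_out = residual(img, attention(img, txt, txt))
    txt_out = residual(txt, attention(txt, img, img))
    return img_out, txt_out


def fuse(img, txt):
    """Assumed fusion: mean-pool each modality, then concatenate."""
    def pool(x):
        return [sum(col) / len(x) for col in zip(*x)]
    return pool(img) + pool(txt)


# Random stand-ins for pre-extracted features (not real data):
# 4 image-region vectors and 3 text-token vectors, each of dimension 8.
random.seed(0)
img_feats = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
txt_feats = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]

img_out, txt_out = mutual_attention_layer(img_feats, txt_feats)
fused = fuse(img_out, txt_out)  # would feed a classification layer
print(len(img_out), len(txt_out), len(fused))
```

In a trained model the queries, keys, and values would normally pass through learned linear projections, and the fused vector would feed a classifier over the lesion classes; both are omitted here to keep the data flow between the two modalities visible.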

Tags:

signal processing society

IEEE icip 2021

september 19-22

virtual conference

2021

sps

virtual conference icip 2021

icip 2021
