ADVIT: Vision Transformer on Multi-Modality PET Images for Alzheimer Disease Diagnosis
Xin Xing, Gongbo Liang, Yu Zhang, Subash Khanal, Ai-Ling Lin, Nathan Jacobs
SPS
Length: 00:04:40
We present a new model trained on two modalities of Positron Emission Tomography (PET) images, PET-AV45 and PET-FDG, for Alzheimer's Disease (AD) diagnosis. Unlike conventional methods, which use multi-modal 3D or 2D Convolutional Neural Network (CNN) architectures, our design replaces the CNN with a Vision Transformer (ViT). Because 3D images are computationally expensive to process, we first apply a 3D-to-2D operation that projects the 3D PET images into 2D fusion images. We then pass the fused multi-modal 2D images to a parallel ViT model for feature extraction, followed by classification for AD diagnosis. For evaluation, we use PET images from the Alzheimer's Disease Neuroimaging Initiative (ADNI). In our experiments, the proposed model outperforms several strong baseline models, achieving 0.91 accuracy and 0.95 AUC.
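The data flow described above (project each 3D volume to 2D, fuse the two modalities, then split the result into patches for a transformer) can be illustrated with a minimal NumPy sketch. The abstract does not specify the 3D-to-2D operation, the fusion scheme, or the patch size, so everything here is an assumption for illustration: a mean-intensity projection stands in for the projection step, channel stacking stands in for fusion, and the function names `project_3d_to_2d`, `fuse_modalities`, and `to_patches` are hypothetical, not the authors' code.

```python
import numpy as np


def project_3d_to_2d(volume, axis=0):
    """Collapse a 3D volume to 2D by averaging along one axis.

    Assumption: mean projection is a stand-in; the abstract does not
    say which 3D-to-2D operation the authors use.
    """
    return volume.mean(axis=axis)


def fuse_modalities(av45, fdg, axis=0):
    """Project two same-shaped PET volumes and stack them as channels.

    Assumption: channel stacking is one simple way to fuse modalities.
    Returns an array of shape (2, H, W).
    """
    if av45.shape != fdg.shape:
        raise ValueError("both volumes must have the same shape")
    return np.stack(
        [project_3d_to_2d(av45, axis), project_3d_to_2d(fdg, axis)], axis=0
    )


def to_patches(image, patch):
    """Split a (C, H, W) image into flattened non-overlapping patches.

    This is the standard ViT tokenisation step. Returns an array of
    shape (num_patches, C * patch * patch).
    """
    c, h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by patch")
    x = image.reshape(c, h // patch, patch, w // patch, patch)
    x = x.transpose(1, 3, 0, 2, 4)  # (H/p, W/p, C, p, p)
    return x.reshape((h // patch) * (w // patch), c * patch * patch)


# Synthetic volumes stand in for co-registered PET-AV45 and PET-FDG scans.
av45 = np.ones((8, 16, 16))
fdg = np.zeros((8, 16, 16))
fused = fuse_modalities(av45, fdg)
patches = to_patches(fused, patch=4)
```

With these synthetic 8 x 16 x 16 volumes, `fused` has shape (2, 16, 16) and `patches` has shape (16, 32). In a full model, each patch row would be linearly embedded and passed through transformer encoder layers before a classification head.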