
DEMENTIA DETECTION BY FUSING SPEECH AND EYE-TRACKING REPRESENTATION

Zhengyan Sheng, Zhiqiang Guo, Xin Li, Yunxia Li, Zhenhua Ling

Length: 00:09:20
08 May 2022

This paper proposes a method for detecting dementia from speech and eye-tracking data recorded simultaneously while subjects perform a picture description task. First, automatic speech recognition (ASR) and regional picture recognition (RPR) models are built to extract content-related bottleneck (BN) features from the speech and eye-tracking inputs, respectively. Then, a neural network is designed to fuse these two modalities and discriminate dementia patients from healthy controls. The network contains a cross-modal Transformer encoder for bimodal interaction and a self-attention Transformer encoder for final classification. Experimental results show that the proposed method achieves a detection accuracy of 84.26%, outperforming both the baseline methods and ablated models that use only speech or only eye-tracking input.
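The bimodal interaction described above can be illustrated with a minimal sketch of cross-modal attention, in which frames of one modality act as queries over the frames of the other. This is not the authors' implementation: the abstract does not give the layer configuration, so the sketch assumes a single head with identity projections (no learned weights), and the names `cross_attention`, `speech_bn`, and `gaze_bn` as well as the toy values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Scaled dot-product attention with identity projections (an assumption).

    queries:     frames of one modality, each a list of d floats
    keys_values: frames of the other modality, each a list of d floats
    Returns one d-dimensional vector per query frame.
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        # Similarity of this query frame to every frame of the other modality.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the other modality's frames.
        fused.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                      for j in range(d)])
    return fused

# Toy, made-up bottleneck features (not data from the paper).
speech_bn = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 speech frames
gaze_bn = [[2.0, 0.0], [0.0, 2.0]]                # 2 eye-tracking frames

# Each modality attends to the other; sequence lengths may differ.
speech_attends_gaze = cross_attention(speech_bn, gaze_bn)
gaze_attends_speech = cross_attention(gaze_bn, speech_bn)
```

Each output sequence keeps the length of its query modality, so speech and eye-tracking streams of different lengths can interact without being aligned frame by frame. A full Transformer encoder would add learned projections, multiple heads, residual connections, and feed-forward layers on top of this step.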
