THE NIO SYSTEM FOR AUDIO-VISUAL DIARIZATION AND RECOGNITION IN MISP CHALLENGE 2022

Gaopeng Xu (NIO); Xianliang Wang (NIO); Sang Wang (NIO); Junfeng Yuan (NIO); Wei Guo (NIO); Wei Li (NIO); Jie Gao (NIO)

10 Jun 2023

This paper describes the NIO system for audio-visual diarization and recognition in the Multimodal Information Based Speech Processing (MISP) Challenge 2022. Our system combines an end-to-end audio-visual neural speaker diarization model with a channel-wise AV-fusion encoder with speaker signature for multi-channel audio-visual speech diarization and recognition. It reduces the concatenated minimum-permutation character error rate (cpCER) by 34.36% absolute compared to the baseline in Track 2.
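
For readers unfamiliar with the metric, the following is a minimal sketch of how cpCER is commonly computed: each speaker's utterances are concatenated, the character-level edit distance is summed over a speaker-to-speaker assignment, and the assignment with the fewest errors is kept. The function names `edit_distance` and `cpcer` are illustrative and not taken from the paper or the challenge toolkit, and the official scoring may apply text normalization that is not shown here.

```python
from itertools import permutations


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cpcer(ref_by_speaker, hyp_by_speaker) -> float:
    """Concatenated minimum-permutation CER (illustrative sketch).

    Each argument is a list with one entry per speaker; every entry is
    that speaker's utterances as a list of strings in time order.
    """
    refs = ["".join(utts) for utts in ref_by_speaker]
    hyps = ["".join(utts) for utts in hyp_by_speaker]

    # Pad with empty speakers so both sides have the same speaker count.
    n = max(len(refs), len(hyps))
    refs += [""] * (n - len(refs))
    hyps += [""] * (n - len(hyps))

    total_ref_chars = sum(len(r) for r in refs)
    if total_ref_chars == 0:
        raise ValueError("reference transcript is empty")

    # Brute force over speaker permutations; fine for a handful of speakers.
    best_errors = min(
        sum(edit_distance(r, h) for r, h in zip(refs, perm))
        for perm in permutations(hyps)
    )
    return best_errors / total_ref_chars
```

Because the result is a ratio of errors to reference characters, an "absolute" reduction such as the one reported above is a difference between two such ratios expressed in percentage points, not a relative improvement.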
