    Length: 12:13
04 May 2020

Building on prior work, we have developed a no-reference (NR) waveform-based convolutional neural network (CNN) architecture that can accurately estimate speech quality or intelligibility of narrowband and wideband speech segments. These Wideband Audio Waveform Evaluation Networks, or WAWEnets, achieve very high per-speech-segment correlation (ρ_seg ≥ 0.92, RMSE ≤ 0.38) to established full-reference (FR) quality and intelligibility estimators (PESQ, POLQA, PEMO, STOI). These results were measured on over 17 hours of speech from 127 previously unseen talkers speaking in 13 different languages, which is just 10% of our total data. NR correlations at this level across this broad scope are unprecedented. This achievement was made possible by using FR estimates as training targets so that WAWEnets could learn implicit undistorted speech models and exploit them to produce accurate NR estimates.
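The two figures of merit quoted above, per-segment correlation and RMSE, can be illustrated with a minimal sketch. It assumes the correlation is a plain Pearson coefficient and that the RMSE is taken directly between NR estimates and FR targets with no additional mapping; the abstract does not spell out the exact evaluation procedure. The function name `segment_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def segment_metrics(estimates, targets):
    """Return (Pearson correlation, RMSE) for paired per-segment scores.

    estimates: NR scores, one per speech segment (hypothetical input).
    targets:   FR scores for the same segments, e.g. from PESQ or STOI.
    """
    if len(estimates) != len(targets) or len(estimates) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(estimates)
    mean_e = sum(estimates) / n
    mean_t = sum(targets) / n
    # Centered sums used by the Pearson correlation coefficient.
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(estimates, targets))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    if var_e == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant input")
    rho = cov / math.sqrt(var_e * var_t)
    # Root-mean-square error between estimates and targets.
    rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, targets)) / n)
    return rho, rmse


# Made-up values for illustration only; not data from the paper.
nr_estimates = [1.0, 2.0, 3.0, 4.0]
fr_targets = [1.5, 2.5, 3.5, 4.5]
rho, rmse = segment_metrics(nr_estimates, fr_targets)
```

In this toy case the estimates track the targets perfectly but sit a constant 0.5 below them, so the correlation is high while the RMSE is not zero. That is why the two numbers are reported together: correlation captures how well the ranking and trend agree, and RMSE captures the absolute error.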
