
DEEP NEURAL NETWORK (DNN) AUDIO CODER USING A PERCEPTUALLY IMPROVED TRAINING METHOD

Seungmin Shin, Joon Byun, Youngcheol Park, Jongmo Sung, Seungkwon Beack

Length: 00:13:14
12 May 2022

A new end-to-end audio coder based on a deep neural network (DNN) is proposed. To compensate for the perceptual distortion caused by quantization, the proposed coder is optimized to minimize distortions in both the signal and perceptual domains. The distortion in the perceptual domain is measured using a psychoacoustic model (PAM), and the loss function is obtained through a two-stage compensation approach. In addition, scalar uniform quantization is approximated using uniform stochastic noise together with a compression-decompression scheme, which provides simpler yet more stable learning than the softmax quantizer, without an additional penalty. Test results show that the proposed coder achieves more accurate noise masking than the previous PAM-based method and better perceptual quality than the MP3 audio coder.
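
The noise-based quantization approximation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the compression function, the step size, or the implementation, so everything below is an assumption for illustration: a mu-law style compander stands in for the compression-decompression scheme, and the names `compress`, `decompress`, `quantize`, and the constant `MU` are hypothetical. During training, rounding is replaced by additive uniform noise of one quantization step in width, which keeps the operation differentiable in a real framework; at inference, actual rounding is used.

```python
import math
import random

MU = 255.0  # hypothetical companding constant; not taken from the abstract


def compress(x, mu=MU):
    """Mu-law style compressor (assumed form), maps [-1, 1] onto [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def decompress(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize(x, step, training, rng=random):
    """Scalar uniform quantization in the compressed domain.

    training=True : add uniform noise in [-step/2, step/2] as a stand-in
                    for rounding (the stochastic approximation).
    training=False: round to the nearest multiple of step.
    """
    y = compress(x)
    if training:
        y_q = y + rng.uniform(-0.5 * step, 0.5 * step)
    else:
        y_q = step * round(y / step)
    return decompress(y_q)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the noisy path is repeatable
    for sample in (-0.42, 0.0, 0.05, 0.9):
        noisy = quantize(sample, step=0.1, training=True, rng=rng)
        hard = quantize(sample, step=0.1, training=False)
        print(f"x={sample:+.2f}  train={noisy:+.4f}  infer={hard:+.4f}")
```

In this sketch the noise and the rounding error share the same bound of half a step in the compressed domain, which is the property that lets the noisy version serve as a training-time proxy for the hard quantizer.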