
A NOVEL METRIC FOR EVALUATING AUDIO CAPTION SIMILARITY

Swapnil P Bhosale (TCS Research and Innovation); Rupayan Chakraborty (TCS Research); Sunil Kumar Kopparapu (TCS Research)

  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
06 Jun 2023

Automatic Audio Captioning (AAC) is the task of describing an audio sample in natural language (NL) text. NL text generation tasks are typically evaluated with lexical semantic metrics such as BLEU. An AAC evaluation metric, by contrast, needs acoustic semantics in addition to lexical semantics, so that NL texts describing similar sounds are mapped to one another. In this paper, we propose a novel metric based on Text-to-Audio Grounding (TAG) to incorporate acoustic semantics. Experiments show that, for AAC, our metric performs better than existing metrics drawn from the NL text and image captioning literature.
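To illustrate why a purely lexical metric can fall short for AAC, the sketch below scores two candidate captions against a reference by word overlap alone. It is a simplified clipped unigram precision, not full BLEU and not the TAG-based metric proposed in the paper. The function name and the three captions are hypothetical and chosen only for illustration.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Fraction of candidate words that also occur in the reference.

    Each candidate word is credited at most as many times as it
    appears in the reference (the "clipping" used in BLEU-style metrics).
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(n, ref_counts[word]) for word, n in cand_counts.items())
    return overlap / total


# Hypothetical captions (assumptions, not data from the paper).
reference = "a dog barks loudly"
acoustically_close = "a puppy yelps noisily"  # similar sound, different words
lexically_close = "a dog walks quietly"       # shared words, different sound

score_acoustic = clipped_unigram_precision(acoustically_close, reference)
score_lexical = clipped_unigram_precision(lexically_close, reference)
print(score_acoustic, score_lexical)
```

In this toy case the caption that shares more words with the reference scores higher, even though the other caption describes a more similar sound. That is the kind of gap acoustic semantics is meant to close.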
