CONVOLUTIONAL RECURRENT NEURAL NETWORKS FOR THE CLASSIFICATION OF CETACEAN BIOACOUSTIC PATTERNS
Dimitris Makropoulos (National Technical University of Athens); Antigoni Tsiami (National Technical University of Athens); Aristides M Prospathopoulos (HCMR); DIMITRIS KASSIS (HCMR); Alexandros Frantzis (Pelagos Cetacean Research Institute); Emmanuel Skarsoulis (Foundation of Research and Technology - HELLAS); George Piperakis (Foundation of Research and Technology -HELLAS); Petros Maragos (National Technical University of Athens)
IEEE Signal Processing Society (SPS)
In this paper we develop a convolutional recurrent neural
network (CRNN) to classify biosignals collected in the
Hellenic Trench and generated by two cetacean species, sperm
whales (Physeter macrocephalus) and striped dolphins
(Stenella coeruleoalba). Audio signals are converted into
mel-spectrograms and fed into a deep residual network
(ResNet) designed to capture spectral patterns. The ResNet's
output is then reshaped through a time-distributed layer and
passed to a recurrent network variant, either Long Short-Term
Memory (LSTM) or Gated Recurrent Unit (GRU) layers, capable
of recognizing long-term temporal dependencies in the
extracted features. The hybrid network classifies audio
signals into three categories (dolphins, sperm whales,
ambient noise) with perfect accuracy, while also showing
strong performance in recognizing intraclass representations
of overlapping acoustic patterns (clicks vs. whistles and
clicks, both emitted by dolphins). The proposed scheme
outperforms traditional machine learning (ML) techniques,
baseline ResNet and LSTM architectures, and their deep
parallel combinations.
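The pipeline described above (mel-spectrogram input, convolutional front-end, time-distributed reshape, recurrent layers, three-way classifier) can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the paper's actual configuration: the small two-block convolutional stack standing in for the ResNet, the GRU hidden size, and all layer dimensions are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn


class CRNNSketch(nn.Module):
    """Minimal CRNN sketch: conv front-end over mel-spectrograms,
    time-distributed reshape, GRU over time frames, 3-way classifier.
    All sizes are illustrative assumptions."""

    def __init__(self, n_mels=64, hidden=128, n_classes=3):
        super().__init__()
        # Small conv stack as a stand-in for the paper's deep ResNet;
        # pooling reduces the frequency axis but preserves time resolution.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
            nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        feat = 32 * (n_mels // 4)  # channels x pooled mel bins per time frame
        self.rnn = nn.GRU(feat, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        # x: (batch, 1, n_mels, time) mel-spectrogram
        h = self.conv(x)                                 # (b, 32, n_mels//4, t)
        b, c, f, t = h.shape
        # "Time-distributed" reshape: one feature vector per time frame.
        h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (b, t, c*f)
        out, _ = self.rnn(h)                             # (b, t, 2*hidden)
        return self.fc(out[:, -1])                       # logits from last frame


model = CRNNSketch()
logits = model(torch.randn(2, 1, 64, 100))  # batch of 2 spectrograms
print(tuple(logits.shape))
```

The last-frame readout is one of several common choices; mean-pooling the GRU outputs over time is an equally plausible alternative for classifying whole clips.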