SPATIO-TEMPORAL GRAPH CONVOLUTIONAL NETWORKS FOR CONTINUOUS SIGN LANGUAGE RECOGNITION

Maria Parelli, Petros Maragos, Katerina Papadimitriou, Gerasimos Potamianos, Georgios Pavlakos

Length: 00:08:32

13 May 2022

We address the challenging problem of continuous sign language recognition (CSLR) from RGB videos, proposing a novel deep-learning framework that employs spatio-temporal graph convolutional networks (ST-GCNs), which operate on multiple, appropriately fused feature streams, capturing the signer's pose, shape, appearance, and motion information. In addition to introducing such networks to the continuous recognition problem, our model's novelty lies in: (i) the feature streams considered and their blending into three ST-GCN modules; (ii) the combination of such modules with bi-directional long short-term memory (BiLSTM) networks, thus capturing both short-term embedded signing dynamics and long-range feature dependencies; and (iii) the fusion scheme, where the resulting modules operate in parallel, with their posteriors aligned via a guiding connectionist temporal classification (CTC) method and fused for sign gloss prediction. Notably, concerning (i), in addition to traditional CSLR features, we investigate the utility of 3D human pose and shape parameterization via the "ExPose" approach, as well as 3D skeletal joint information that is regressed from detected 2D joints. We evaluate the proposed system on two well-known CSLR benchmarks, conducting extensive ablations on its modules. We achieve a new state of the art on one of the two datasets, while reaching very competitive performance on the other.
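As a rough illustration of point (iii), the sketch below fuses per-frame log-posteriors from parallel streams and then applies greedy CTC decoding (take the best label per frame, collapse repeats, drop blanks). The abstract does not specify the fusion rule or the decoder, so the weighted log-average, the blank index, and the function names `fuse_log_posteriors` and `ctc_greedy_decode` are assumptions made for this example, not the authors' implementation.

```python
import math

BLANK = 0  # assumed index of the CTC blank label


def fuse_log_posteriors(streams, weights=None):
    """Weighted average of per-frame log-posteriors from several streams.

    streams: list of S sequences; each is a list of T frames, and each
    frame is a list of K log-probabilities over the gloss vocabulary.
    Returns T frames of renormalised log-probabilities.
    """
    n = len(streams)
    weights = weights or [1.0 / n] * n  # assumption: equal stream weights
    fused = []
    for frames in zip(*streams):  # the same time step from every stream
        scores = [
            sum(w * f[k] for w, f in zip(weights, frames))
            for k in range(len(frames[0]))
        ]
        # Renormalise with log-sum-exp so each frame is a distribution again.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        fused.append([s - log_z for s in scores])
    return fused


def ctc_greedy_decode(log_probs, blank=BLANK):
    """Best label per frame, then collapse repeats and remove blanks."""
    best = [max(range(len(f)), key=f.__getitem__) for f in log_probs]
    decoded, prev = [], None
    for label in best:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


def to_log(frames):
    """Helper for the toy data: convert probabilities to log-probabilities."""
    return [[math.log(p) for p in f] for f in frames]


# Toy posteriors for two streams over labels {0: blank, 1, 2}, five frames.
stream_a = to_log([[0.1, 0.8, 0.1], [0.1, 0.8, 0.1], [0.8, 0.1, 0.1],
                   [0.1, 0.2, 0.7], [0.1, 0.2, 0.7]])
stream_b = to_log([[0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.7, 0.2, 0.1],
                   [0.5, 0.1, 0.4], [0.1, 0.1, 0.8]])

fused = fuse_log_posteriors([stream_a, stream_b])
glosses = ctc_greedy_decode(fused)
```

In this toy case the second stream on its own prefers the blank label in frames 2 and 4, while the fused posteriors follow the more confident first stream there; the decoded gloss sequence is `[1, 2]` either way.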

Tags:

continuous sign language recognition

spatio-temporal graph convolutional networks

expose

ctc

bilstm

Value-Added Bundle(s) Including this Product

ICASSP 2022, May 2022 Virtual and In-Person Conference - Presentation Videos Product Bundle
