DL-NET: DILATION LOCATION NETWORK FOR TEMPORAL ACTION DETECTION

Dianlong You (Yanshan University); Houlin Wang (Yanshan University); Bingxin Liu (Yanshan University); Yang Yu (Yanshan University); Zhiming Li (Yanshan University)

06 Jun 2023

Temporal action detection (TAD) is a challenging task in video understanding. Current methods mainly use global features for boundary matching or predefine all possible proposals, while ignoring long contextual information and local action boundary features, which reduces detection accuracy. To fill this gap, we propose the Dilation Location Network (DL-Net), which generates more precise action boundaries by enhancing the boundary features of actions and aggregating long contextual information. Specifically, we design a boundary feature enhancement (BFE) block, which strengthens action boundary features and fuses similar features from different channels through pooling and channel squeezing. Meanwhile, for action localization, we design multiple dilated convolution structures to aggregate long contextual information for each time point and interval. Extensive experiments on ActivityNet-1.3 and Thumos14 show that DL-Net enhances action boundary features and aggregates long contextual information effectively.
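To illustrate why dilated convolutions help aggregate long temporal context, here is a minimal sketch of a 1D dilated convolution over a sequence of per-time-step scores. It is not the DL-Net implementation: the function names, the zero padding, the odd kernel length, and the sequential stacking of layers are all assumptions made for illustration, since the abstract does not specify how the dilated structures are arranged.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """Same-length 1D dilated convolution (cross-correlation form).

    Hypothetical helper: zero padding and an odd kernel length are assumed.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch expects an odd kernel length")
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` steps apart around time step t.
            idx = t + (j - half) * dilation
            if 0 <= idx < len(x):  # positions outside the sequence count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of sequentially stacked dilated layers (stride 1).

    Each layer widens the field by (kernel_size - 1) * dilation time steps.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    # With dilation 2, a 3-tap kernel reaches time steps t-2, t, and t+2.
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))
    # Three stacked 3-tap layers with dilations 1, 2, 4 cover 15 time steps.
    print(receptive_field(3, [1, 2, 4]))
```

With the same three-tap kernel, three undilated layers would cover only 7 time steps, so dilation extends temporal context without adding parameters.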

Tags:

Image and video content analysis

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
