    Length: 15:04
04 May 2020

Crowd counting in unconstrained and congested scenes is an important task in computer vision. Its main difficulties stem from large variations in scale and density and from a tendency to overfit. This paper presents a novel end-to-end stochastic multi-scale aggregation network (SMANet) that addresses these issues. Specifically, general features are first extracted by the front-end subnetwork and then fed into the back-end subnetwork, which consists of a stochastic multi-scale aggregation module, a density map generator, and a global prior encoder. The stochastic aggregation drives the multi-branch units to learn features at different scales effectively and reduces sensitivity to scale variations, whereas the global prior encoder is designed to encode global contextual information and enforce density consistency of the shared representations. The proposed SMANet is the first work to fuse multi-scale features in a stochastic manner for crowd counting. Experimental results on four public datasets demonstrate that SMANet consistently outperforms state-of-the-art methods.
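
The abstract does not spell out how the stochastic fusion is performed, so the following is only a minimal sketch of one plausible reading, not the authors' method: during training, a random subset of same-shaped branch outputs is averaged, and at inference all branches are averaged. The function name `stochastic_aggregate`, the `keep_prob` parameter, and the toy branch values are all hypothetical and chosen for illustration.

```python
import random


def stochastic_aggregate(branch_outputs, keep_prob=0.5, training=True, rng=None):
    """Fuse same-shaped branch outputs by averaging a random subset of them.

    Assumption (not from the paper): each branch is kept independently with
    probability keep_prob during training; all branches are used at inference.
    """
    if rng is None:
        rng = random.Random()
    n = len(branch_outputs)
    if training:
        kept = [i for i in range(n) if rng.random() < keep_prob]
        if not kept:
            # Always keep at least one branch so the output is defined.
            kept = [rng.randrange(n)]
    else:
        kept = list(range(n))
    length = len(branch_outputs[0])
    # Element-wise mean over the branches that survived sampling.
    return [sum(branch_outputs[i][j] for i in kept) / len(kept) for j in range(length)]


# Three hypothetical branches standing in for features at different scales.
branches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused_eval = stochastic_aggregate(branches, training=False)
fused_train = stochastic_aggregate(branches, keep_prob=0.5, training=True,
                                   rng=random.Random(0))
```

In this sketch the random subset changes on every training call, so no single branch can be relied on by later layers, which is one way a network could be pushed toward scale-robust features. A real implementation would operate on feature tensors rather than flat lists.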
