Poster 10 Oct 2023

Extracting and aggregating multiple feature representations at various scales has become key to point cloud tasks. Although the Vision Transformer (ViT) is currently popular for processing point clouds, it lacks adequate multi-scale features and interaction among them, which are vital for identifying structural details in a point cloud. In addition, learning efficient and effective representations from point clouds is challenging due to their irregular, unordered, and sparse nature. Motivated by these observations, we propose a novel multi-scale representation learning transformer framework that employs varied geometric features beyond common Cartesian coordinates. Our approach enriches the description of a point cloud with local geometric relationships and then groups these features at multiple scales. The scale information is aggregated, and new patches are extracted that minimize feature overlap. A bottleneck projection head then enhances the information and feeds all patches to multi-head attention, which captures the deep dependencies among representations across patches. Evaluation on public benchmark datasets shows the competitive performance of our framework on point cloud classification.
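The pipeline described above — local geometric features beyond Cartesian coordinates, grouping at multiple scales, aggregation, and multi-head attention over the resulting representations — can be sketched in simplified form. The descriptors (mean neighbor offset and distance), the scale set, and the identity-projection attention below are illustrative stand-ins chosen for this sketch, not the paper's actual components:

```python
import numpy as np

def knn_geometric_features(points, k):
    """Append simple local geometric descriptors (mean neighbor offset
    and mean neighbor distance) to each point's Cartesian coordinates.
    A hypothetical stand-in for the richer geometric features used."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]            # k nearest neighbors
    offsets = points[idx] - points[:, None, :]          # (N, k, 3)
    mean_off = offsets.mean(axis=1)                     # (N, 3)
    mean_dist = np.linalg.norm(offsets, axis=-1).mean(axis=1, keepdims=True)
    return np.concatenate([points, mean_off, mean_dist], axis=1)  # (N, 7)

def multi_scale_features(points, scales=(4, 8, 16)):
    """Group local geometry at several neighborhood sizes and
    aggregate the per-scale descriptors by concatenation."""
    return np.concatenate(
        [knn_geometric_features(points, k) for k in scales], axis=1)

def multi_head_attention(x, num_heads):
    """Toy multi-head self-attention (identity Q/K/V projections)
    capturing dependencies across per-point representations."""
    n, d = x.shape
    assert d % num_heads == 0
    hd = d // num_heads
    out = np.empty_like(x)
    for h in range(num_heads):
        xh = x[:, h * hd:(h + 1) * hd]
        scores = xh @ xh.T / np.sqrt(hd)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)               # row-wise softmax
        out[:, h * hd:(h + 1) * hd] = w @ xh
    return out

rng = np.random.default_rng(0)
pts = rng.standard_normal((32, 3))          # a toy point cloud
feats = multi_scale_features(pts)           # (32, 21): 7 dims x 3 scales
attended = multi_head_attention(feats, 3)   # (32, 21)
```

In this sketch the per-scale descriptors are simply concatenated; the paper's bottleneck projection head and its overlap-minimizing patch extraction would sit between aggregation and attention.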
