Efficient Scalable 360-Degree Video Compression Scheme Using 3D Cuboid Partitioning
Fariha Afsana, Manoranjan Paul, Manzur Murshed, David Taubman
Hough voting based on PointNet++ [1] is effective for 3D object detection, as verified by VoteNet [2], H3DNet [3], and related methods. However, we find there is still room for improvement in two respects. First, most existing methods ignore the distinct importance of different input formats and geometric primitives when predicting object proposals. Second, the features extracted by PointNet++ overlook contextual information about each object. In this paper, to tackle these issues, we introduce MCGNet, which learns multi-level geometric-aware and scale-aware contextual information for 3D object detection. Specifically, our network consists of a baseline module based on H3DNet, a geometric-aware module, and a context-aware module. The baseline module, fed with four input types (Point, Edge, Surface, and Line), concentrates on extracting diversified geometric primitives, i.e., bounding box (BB) centers, BB face centers, and BB edge centers. The geometric-aware module is proposed to learn the different contributions of the four types of feature maps and the three geometric primitives. The context-aware module establishes long-range dependency features for both the four types of feature maps and the three geometric primitives. Extensive experiments on two large datasets with real 3D scans, SUN RGB-D and ScanNet, demonstrate that our method is effective for 3D object detection.
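The abstract does not give implementation details for the two proposed modules. As an illustration only, the minimal PyTorch sketch below shows one plausible way to realize the two ideas: learned per-stream weights that fuse several feature streams (standing in for the geometric-aware module) and self-attention over points to capture long-range dependencies (standing in for the context-aware module). The class names GeometricAwareFusion and ContextAwareModule, the tensor shapes, and the specific attention design are assumptions, not the authors' implementation.

# Hypothetical sketch, not the MCGNet code: assumes per-point feature
# maps of shape (B, N, C) and standard PyTorch layers.
import torch
import torch.nn as nn

class GeometricAwareFusion(nn.Module):
    """Learn scalar weights for K feature streams (e.g., the four input
    types or the three geometric primitives) and fuse them."""
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 1),
        )

    def forward(self, streams):
        # streams: list of K tensors, each (B, N, C)
        stacked = torch.stack(streams, dim=1)             # (B, K, N, C)
        pooled = stacked.mean(dim=2)                      # (B, K, C)
        weights = torch.softmax(self.score(pooled), dim=1)  # (B, K, 1)
        return (stacked * weights.unsqueeze(2)).sum(dim=1)  # (B, N, C)

class ContextAwareModule(nn.Module):
    """Self-attention over points to model long-range dependencies."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feats):
        # feats: (B, N, C)
        ctx, _ = self.attn(feats, feats, feats)
        return self.norm(feats + ctx)

if __name__ == "__main__":
    B, N, C = 2, 1024, 256
    streams = [torch.randn(B, N, C) for _ in range(4)]  # four feature streams
    fused = GeometricAwareFusion(channels=C)(streams)
    out = ContextAwareModule(C)(fused)
    print(out.shape)  # torch.Size([2, 1024, 256])

In this sketch the fusion weights are shared across all points of a stream; a per-point weighting or a different attention scheme would be an equally plausible reading of the abstract.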