TRANSBUILDING: AN END-TO-END POLYGONAL BUILDING EXTRACTION WITH TRANSFORMERS
Mingming Zhang, Qingjie Liu, Wei Wang, Yunhong Wang
SPS
In this paper, we propose a simple yet powerful network, called TransBuilding, for high-quality polygonal building extraction from remote sensing images. Unlike many previous methods, which vectorize building masks through mask refinement and fitting or through vertex prediction and assembly, our approach predicts the building vertex sequence directly with a vertex transformer branch (termed VertexFormer), without any additional processing. The VertexFormer branch represents a polygon as a Bi-directional Ring with no assumed start or end vertex, which yields a simple and elegant polygon representation and avoids the ambiguity of defining a start vertex. Furthermore, three self-attention modules (row-wise, column-wise, and vertex-wise) are integrated in parallel to better capture the geometric structure of building polygons. We graft the VertexFormer module onto the standard Faster R-CNN detector and train the model end to end using a novel Bi-Ring loss derived from the Bi-directional Ring perspective. Extensive experiments on the CrowdAI benchmark dataset demonstrate that our method outperforms state-of-the-art methods by considerable margins.
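To illustrate why a ring-style representation removes the start-vertex ambiguity, the sketch below compares two polygons under every cyclic shift and both traversal directions and keeps the best alignment. It is an assumption-laden illustration only: the abstract does not define the Bi-Ring loss, so the function name `ring_invariant_distance`, the mean L1 vertex distance, and the requirement of equal vertex counts are all hypothetical choices and should not be read as the formulation used in TransBuilding.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def l1_distance(a: List[Point], b: List[Point]) -> float:
    """Sum of per-vertex L1 distances between two equally long vertex lists."""
    return sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(a, b))


def ring_invariant_distance(pred: List[Point], target: List[Point]) -> float:
    """Mean L1 vertex distance, minimized over start vertex and direction.

    Hypothetical illustration: the target is treated as a closed ring, so
    every cyclic shift and both traversal orders count as the same polygon.
    """
    if len(pred) != len(target):
        raise ValueError("this sketch assumes equal vertex counts")
    n = len(target)
    if n == 0:
        return 0.0
    best = float("inf")
    # Try the target in its given order and in reversed order.
    for ring in (target, target[::-1]):
        # Try every possible start vertex on the ring.
        for shift in range(n):
            rotated = ring[shift:] + ring[:shift]
            best = min(best, l1_distance(pred, rotated))
    return best / n


if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    shifted = square[2:] + square[:2]   # same polygon, different start vertex
    print(ring_invariant_distance(square, shifted))       # 0.0
    print(ring_invariant_distance(square, square[::-1]))  # 0.0
```

Under this sketch, the same square listed from a different start vertex or in the opposite direction has distance zero, which is the invariance the Bi-directional Ring view is meant to provide. The brute-force search costs O(n^2) per polygon pair and is not differentiable through the choice of alignment; it is meant only to make the idea concrete.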