HOUGHENCODER: NEURAL NETWORK ARCHITECTURE FOR DOCUMENT IMAGE SEMANTIC SEGMENTATION
Alexander Sheshkus, Dmitry Nikolaev, Vladimir L. Arlazarov
SPS
In this paper, we propose HoughEncoder, a neural network architecture for semantic image segmentation. Its main feature is that it contains layers that compute direct and transposed integral operators, namely the Fast Hough Transform. These layers split a deep fully convolutional architecture into three blocks. As a result, the network can make a decision at every point using integral features computed along different lines. Importantly, this does not increase the complexity of the network in terms of the number of trainable parameters. Our experiments on the publicly available datasets MIDV-500 and MIDV-2019 (both train and test) show that the suggested modification greatly improves segmentation quality. HoughEncoder outperforms UNet, which shows state-of-the-art results in many semantic image segmentation tasks, even though HoughEncoder has one hundred times fewer parameters.
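To make the idea of parameter-free "direct" and "transposed" integral layers concrete, here is a minimal NumPy sketch. It is not the Fast Hough Transform used in the paper: it is a naive line-sum operator over one family of discrete lines with cyclic wrap-around, chosen only for readability. The function names (`line_columns`, `hough_forward`, `hough_transposed`) and the line parameterization (start column `s`, total shift `t`) are assumptions made for this illustration.

```python
import numpy as np


def line_columns(h, w, s, t):
    """Pixels of one discrete line (hypothetical parameterization).

    The line starts at column s in row 0 and drifts right by t columns
    over the full image height, wrapping around cyclically.
    """
    rows = np.arange(h)
    offsets = np.rint(rows * t / max(h - 1, 1)).astype(int)
    return rows, (s + offsets) % w


def hough_forward(img, n_shifts):
    """Direct operator: sum the image along every line (s, t)."""
    h, w = img.shape
    acc = np.zeros((n_shifts, w), dtype=float)
    for t in range(n_shifts):
        for s in range(w):
            rows, cols = line_columns(h, w, s, t)
            acc[t, s] = img[rows, cols].sum()
    return acc


def hough_transposed(acc, h):
    """Transposed operator: spread each accumulator value back along its line."""
    n_shifts, w = acc.shape
    img = np.zeros((h, w), dtype=float)
    for t in range(n_shifts):
        for s in range(w):
            rows, cols = line_columns(h, w, s, t)
            # Each row occurs once per line, so the index pairs are unique.
            img[rows, cols] += acc[t, s]
    return img
```

Both functions are fixed linear maps with no trainable weights, which is consistent with the abstract's remark that such layers do not add parameters. In a network of the kind described, convolutional blocks placed before, between, and after these two operators would hold all of the trainable parameters; that arrangement is only paraphrased here from the abstract, not reproduced from the authors' code.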