CTTSR: A Hybrid CNN-Transformer Network for Scene Text Image Super-Resolution

Kaiwei Dai (Central South University); Nan Kang (Central South University); Li Kuang (Central South University)


06 Jun 2023

Deep learning has significantly improved the accuracy of scene text recognition. However, blurred and low-resolution text images still often lead to unsatisfactory recognition results. Several super-resolution models adopt convolutional neural networks (CNNs) to reduce image blurring, but these models are limited by the receptive field of the convolution kernel and fail to adequately extract the long-distance semantic relations in text images. In this paper, we propose a CNN-Transformer Text Super-Resolution Network (CTTSR) that captures the semantic features of text images through the multi-head attention mechanism of the transformer. Furthermore, we propose a text position loss to optimize the network and make the text regions of images easier to detect. Experimental results demonstrate that our model improves image quality and outperforms existing methods in text recognition tasks.
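The receptive-field limitation mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `receptive_field`, the five-layer stack, and the 64-pixel crop width are assumptions chosen for the example and are not taken from the paper or the CTTSR architecture.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of stacked conv layers.

    `layers` is a list of (kernel_size, stride) pairs, ordered input to output.
    """
    rf = 1    # before any layer, one output position sees one input pixel
    jump = 1  # spacing, in input pixels, between adjacent features at this depth
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: five 3x3 convolutions with stride 1 (not the paper's network).
stack = [(3, 1)] * 5
print(receptive_field(stack))  # 11

# Hypothetical width of a low-resolution text crop, chosen only for illustration.
image_width = 64
print(receptive_field(stack) < image_width)  # True
```

Under these assumptions, each output position sees only 11 input pixels along one axis, so characters at opposite ends of a 64-pixel-wide crop never influence each other directly. A self-attention layer, by contrast, lets every position attend to every other position in a single step, which is the motivation the abstract gives for adding a transformer.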

Tags:

Computational image formation

Value-Added Bundle(s) Including this Product

IEEE ICASSP 2023, 4-10 June 2023, Greece. Virtual and In-Person Conference - Presentation Videos Product Bundle
