Semantic-Based Sentence Recognition In Images Using Bimodal Deep Learning
Yi Zheng, Qitong Wang, Margrit Betke
The accuracy of computer vision systems that understand sentences in images can be improved when semantic information about the text is utilized. However, state-of-the-art systems typically ignore the semantic coherence within a region of text in natural or document images, identifying isolated words or interpreting text word by word. Seemingly isolated words may be easier to recognize when analyzed together. On this basis, we propose a novel "Semantic-based Sentence Recognition" (SSR) deep learning model that reads text in images with the help of contextual understanding. SSR consists of a Word Ordering and Grouping Algorithm (WOGA), which finds sentences in images, and a Sequence-to-Sequence Recognition Correction (SSRC) model, which extracts semantic information from these sentences to improve their recognition. To show the effectiveness and generality of SSR, we present experiments on three notably distinct datasets, two of which we created ourselves: one contains scanned catalog images of interior designs, the other photographs of protesters with hand-written signs. Our results show that SSR statistically significantly outperforms a baseline method that uses state-of-the-art single-word recognition techniques on all three datasets. By combining computer vision and natural language processing methodologies, we reveal the important opportunity that bimodal deep learning provides in addressing a task previously considered a single-modality computer vision task.
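The two-stage pipeline described above can be illustrated with a short sketch: a geometric grouping step standing in for WOGA, followed by a correction hook standing in for SSRC. All names and the grouping heuristic here are illustrative assumptions, not the authors' implementation; in particular, SSRC would be a trained sequence-to-sequence network, not the identity placeholder shown.

    # Minimal sketch of the SSR pipeline, under assumed interfaces.
    # WordBox, group_words_into_sentences, and correct_sentence are
    # hypothetical names for illustration only.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class WordBox:
        text: str   # word string from a single-word recognizer
        x: float    # left edge of the word's bounding box
        y: float    # top edge of the word's bounding box
        h: float    # box height, used to decide line membership

    def group_words_into_sentences(words: List[WordBox]) -> List[str]:
        """Toy stand-in for WOGA: order words top-to-bottom, left-to-right,
        and start a new sentence when the vertical gap exceeds one box height."""
        words = sorted(words, key=lambda w: (w.y, w.x))
        sentences, current, prev = [], [], None
        for w in words:
            if prev is not None and w.y - prev.y > prev.h:
                sentences.append(" ".join(current))
                current = []
            current.append(w.text)
            prev = w
        if current:
            sentences.append(" ".join(current))
        return sentences

    def correct_sentence(sentence: str) -> str:
        """Placeholder for SSRC: a trained sequence-to-sequence model would
        map the noisy recognized sentence to a semantically coherent one."""
        return sentence  # identity here; a real model would decode a correction

    if __name__ == "__main__":
        boxes = [WordBox("NO", 10, 10, 20), WordBox("JUSTICE", 60, 10, 20),
                 WordBox("NO", 10, 50, 20), WordBox("PEACE", 60, 50, 20)]
        for s in group_words_into_sentences(boxes):
            print(correct_sentence(s))  # prints "NO JUSTICE", then "NO PEACE"

The sketch makes the division of labor concrete: grouping is purely geometric, while the semantic leverage the abstract emphasizes would come entirely from the learned correction model applied to each grouped sentence.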