A Deep Multimodal Approach For Map Image Classification
Tomoya Sawada, Marie Katsurai
Map images (e.g., illustrated maps, historical maps, and geographic maps) have been published around the world, not only to indicate locations but also to attract tourists or to hand down the histories of places. The management of map data, however, remains an open issue for several research fields, including digital libraries, the humanities, and tourism studies. This paper explores an approach for classifying diverse map images by their themes using map content features. Specifically, we present a novel strategy for preprocessing the text positioned inside map images, which is extracted using OCR. The activations of the textual feature-based model are combined with the visual features in an early fusion manner. Finally, we train a classifier comprising a convolutional layer and a fully connected layer, which predicts the class of the input map. In experiments conducted on a new labeled dataset of map images, we demonstrate that our approach using the fused features achieves better classification performance than either single modality alone. We have made our dataset publicly available on the Internet to facilitate this new task.
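To make the fusion step concrete, the sketch below shows one way to implement an early-fusion classifier with a convolutional layer followed by a fully connected layer, as described above. It is a minimal PyTorch illustration, not the authors' released code: the class name, feature dimensions, and the choice to broadcast the OCR-text vector over the visual feature map before concatenation are all assumptions made for this example.

```python
# Minimal sketch of an early-fusion map classifier (illustrative assumptions,
# not the authors' exact implementation).
import torch
import torch.nn as nn


class MapFusionClassifier(nn.Module):
    """Fuses CNN visual features with OCR-derived text features, then classifies."""

    def __init__(self, visual_dim=2048, text_dim=300, num_classes=10):
        super().__init__()
        # Early fusion by channel-wise concatenation, then a 1x1 convolution
        # and a fully connected layer that outputs per-class scores.
        self.conv = nn.Conv2d(visual_dim + text_dim, 512, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(512, num_classes)

    def forward(self, visual_feat, text_feat):
        # visual_feat: (B, visual_dim, H, W) activations from an image backbone
        # text_feat:   (B, text_dim)        activations from a text model over OCR output
        b, _, h, w = visual_feat.shape
        # Broadcast the text vector over the spatial grid so both modalities
        # share the same spatial shape before concatenation.
        text_map = text_feat.view(b, -1, 1, 1).expand(-1, -1, h, w)
        fused = torch.cat([visual_feat, text_map], dim=1)
        x = torch.relu(self.conv(fused))
        x = self.pool(x).flatten(1)
        return self.fc(x)  # class logits for the input map image
```

In this sketch the fused tensor is reduced by global average pooling before the fully connected layer; the actual backbone networks, feature dimensions, and number of map-theme classes depend on the paper's experimental setup.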