A Deep Multimodal Approach for Map Image Classification

Abstract

Map images are published around the world, but managing map data remains an open issue in several research fields. This paper explores an approach for classifying diverse map images by theme using map content features. Specifically, we present a novel strategy for preprocessing the text positioned inside map images, which is extracted using OCR. The activations of the textual feature-based model are then fused with the visual features in an early fusion manner. Finally, we train a classifier that predicts the class of the input map. We have made our dataset available here to facilitate this new task.
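The early fusion step described above can be sketched as follows. This is an illustrative example only, not the paper's implementation: the feature dimensions, the linear classifier, and all names here are hypothetical stand-ins for the actual textual and visual models.

```python
import numpy as np

# Hypothetical dimensions; the paper's actual feature sizes differ.
VISUAL_DIM = 4    # e.g. activations from a visual (CNN) model
TEXT_DIM = 3      # e.g. features from the OCR-text-based model
NUM_CLASSES = 2   # number of map themes (hypothetical)

rng = np.random.default_rng(0)

def early_fusion(visual_feat, text_feat):
    """Concatenate the two modality vectors into one joint representation."""
    return np.concatenate([visual_feat, text_feat])

# A stand-in linear classifier over the fused features (random weights,
# purely to show the data flow; the paper trains a real classifier).
W = rng.standard_normal((NUM_CLASSES, VISUAL_DIM + TEXT_DIM))

def predict_theme(visual_feat, text_feat):
    fused = early_fusion(visual_feat, text_feat)  # shape: (VISUAL_DIM + TEXT_DIM,)
    logits = W @ fused
    return int(np.argmax(logits))

label = predict_theme(rng.standard_normal(VISUAL_DIM),
                      rng.standard_normal(TEXT_DIM))
```

The point of early fusion is that the classifier sees both modalities jointly, rather than combining per-modality predictions afterwards (late fusion).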

Download (Dataset)

If you use our dataset, please cite the following paper.

Reference

T. Sawada and M. Katsurai, “A Deep Multimodal Approach for Map Image Classification,” in 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), accepted for publication.

PDF