We propose flexgrid2vec, a novel approach for image representation learning. Existing visual representation methods suffer from several issues, including the need for highly intensive computation, the risk of losing in-depth structural information and the specificity of the method to certain shapes or objects. flexgrid2vec converts an image to a low-dimensional feature vector. We represent each image with a graph of flexible, unique node locations and edge distances. flexgrid2vec is a multichannel GCN that learns features of the most representative image patches. We have investigated both spectral and nonspectral implementations of the GCN node-embedding. Specifically, we have implemented flexgrid2vec based on different node-aggregation methods, such as vector summation, concatenation and normalisation with eigenvector centrality. We compare the performance of flexgrid2vec with a set of state-of-the-art visual representation learning models on binary and multi-class image classification tasks. Although we utilise imbalanced, low-size and low-resolution datasets, flexgrid2vec shows stable and outstanding results against well-known base classifiers. flexgrid2vec achieves 96.23% on CIFAR-10, 83.05% on CIFAR-100, 94.50% on STL-10, 98.8% on ASIRRA and 89.69% on the COCO dataset.
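The three node-aggregation options named in the abstract (vector summation, concatenation, and normalisation with eigenvector centrality) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `eigenvector_centrality` and `aggregate`, the toy star graph, and the random node features are all assumptions made for illustration, and the sketch assumes per-node feature vectors have already been computed by some earlier step.

```python
import numpy as np


def eigenvector_centrality(adj, iters=200, tol=1e-10):
    """Power iteration on a symmetric, non-negative adjacency matrix.

    Hypothetical helper for illustration only.
    """
    n = adj.shape[0]
    c = np.full(n, 1.0 / np.sqrt(n))
    # Adding the identity keeps the same eigenvectors but avoids
    # oscillation on bipartite graphs such as stars and paths.
    m = adj + np.eye(n)
    for _ in range(iters):
        nxt = m @ c
        nxt /= np.linalg.norm(nxt)
        converged = np.linalg.norm(nxt - c) < tol
        c = nxt
        if converged:
            break
    return c


def aggregate(node_feats, adj, mode):
    """Collapse per-node features (n x d) into one image-level vector."""
    if mode == "sum":
        return node_feats.sum(axis=0)       # shape (d,)
    if mode == "concat":
        return node_feats.reshape(-1)       # shape (n * d,)
    if mode == "centrality":
        w = eigenvector_centrality(adj)
        w = w / w.sum()                     # weights sum to 1
        return w @ node_feats               # shape (d,)
    raise ValueError(f"unknown mode: {mode}")


# Toy star graph (assumed for illustration): node 0 is linked to nodes 1..3.
adj = np.zeros((4, 4))
adj[0, 1:] = 1.0
adj[1:, 0] = 1.0

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 3))             # 4 nodes, 3 features each

vec_sum = aggregate(feats, adj, "sum")
vec_cat = aggregate(feats, adj, "concat")
vec_cen = aggregate(feats, adj, "centrality")
```

Summation and centrality weighting give a vector whose length is independent of the number of nodes, whereas concatenation grows with the node count and depends on node order. In this sketch, centrality weighting lets better-connected nodes contribute more to the image-level vector.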
@article{DBLP:journals/corr/abs-2007-15444,
archiveprefix = {arXiv},
author = {Ali Hamdi and
Du Yong Kim and
Flora D. Salim},
bibsource = {dblp computer science bibliography, https://dblp.org},
biburl = {https://dblp.org/rec/journals/corr/abs-2007-15444.bib},
eprint = {2007.15444},
journal = {CoRR},
timestamp = {Mon, 03 Aug 2020 01:00:00 +0200},
title = {grid2vec: Learning Efficient Visual Representations via Flexible Grid-Graphs},
url = {https://arxiv.org/abs/2007.15444},
volume = {abs/2007.15444},
year = {2020}
}