TY - JOUR
T1 - Visual Permutation Learning
AU - Cruz, Rodrigo Santa
AU - Fernando, Basura
AU - Cherian, Anoop
AU - Gould, Stephen
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2019/12/1
Y1 - 2019/12/1
N2 - We present a principled approach to uncover the structure of visual data by solving a deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Permutation matrices are discrete, thereby posing difficulties for gradient-based optimization methods. To this end, we resort to a continuous approximation using doubly-stochastic matrices and formulate a novel bi-level optimization problem on such matrices that learns to recover the permutation. Unfortunately, such a scheme leads to expensive gradient computations. We circumvent this issue by further proposing a computationally cheap scheme for generating doubly stochastic matrices based on Sinkhorn iterations. To implement our approach we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on three challenging computer vision problems, namely, relative attributes learning, supervised learning-to-rank, and self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for relative attributes learning, chronological and interestingness image ranking for supervised learning-to-rank, and competitive results in the classification and segmentation tasks of the PASCAL VOC dataset for self-supervised representation learning.
AB - We present a principled approach to uncover the structure of visual data by solving a deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Permutation matrices are discrete, thereby posing difficulties for gradient-based optimization methods. To this end, we resort to a continuous approximation using doubly-stochastic matrices and formulate a novel bi-level optimization problem on such matrices that learns to recover the permutation. Unfortunately, such a scheme leads to expensive gradient computations. We circumvent this issue by further proposing a computationally cheap scheme for generating doubly stochastic matrices based on Sinkhorn iterations. To implement our approach we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on three challenging computer vision problems, namely, relative attributes learning, supervised learning-to-rank, and self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for relative attributes learning, chronological and interestingness image ranking for supervised learning-to-rank, and competitive results in the classification and segmentation tasks of the PASCAL VOC dataset for self-supervised representation learning.
KW - Permutation learning
KW - learning-to-rank
KW - relative attributes
KW - representation learning
KW - self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85074545837&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2018.2873701
DO - 10.1109/TPAMI.2018.2873701
M3 - Article
SN - 0162-8828
VL - 41
SP - 3100
EP - 3114
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 12
M1 - 8481554
ER -