Visual Permutation Learning

Rodrigo Santa Cruz*, Basura Fernando, Anoop Cherian, Stephen Gould

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    36 Citations (Scopus)

    Abstract

    We present a principled approach to uncover the structure of visual data by solving a deep learning task coined visual permutation learning. The goal of this task is to find the permutation that recovers the structure of data from shuffled versions of it. In the case of natural images, this task boils down to recovering the original image from patches shuffled by an unknown permutation matrix. Permutation matrices are discrete, thereby posing difficulties for gradient-based optimization methods. To this end, we resort to a continuous approximation using doubly-stochastic matrices and formulate a novel bi-level optimization problem on such matrices that learns to recover the permutation. Unfortunately, such a scheme leads to expensive gradient computations. We circumvent this issue by further proposing a computationally cheap scheme for generating doubly stochastic matrices based on Sinkhorn iterations. To implement our approach we propose DeepPermNet, an end-to-end CNN model for this task. The utility of DeepPermNet is demonstrated on three challenging computer vision problems, namely, relative attributes learning, supervised learning-to-rank, and self-supervised representation learning. Our results show state-of-the-art performance on the Public Figures and OSR benchmarks for relative attributes learning, chronological and interestingness image ranking for supervised learning-to-rank, and competitive results in the classification and segmentation tasks of the PASCAL VOC dataset for self-supervised representation learning.

    Original language: English
    Article number: 8481554
    Pages (from-to): 3100-3114
    Number of pages: 15
    Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
    Volume: 41
    Issue number: 12
    Publication status: Published - 1 Dec 2019
