Freeview Sketching

View-Aware Fine-Grained Sketch-Based Image Retrieval

SketchX, CVSSP, University of Surrey, United Kingdom

While searching for a specific photo, users are often confused as to ‘Which view should I sketch?’ Adding to their worries, models trained on existing FG-SBIR datasets, whose sketch-photo pairs are matched against a fixed view, fail to fetch a photo even when it’s present in the gallery unless its view exactly matches that of the query-sketch. We aim to alleviate this issue and incorporate view-awareness in the FG-SBIR paradigm.

Abstract

In this paper, we delve into the intricate dynamics of Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) by addressing a critical yet overlooked aspect: the choice of viewpoint during sketch creation. Unlike photo systems that seamlessly handle diverse views through extensive datasets, sketch systems, with limited data collected from fixed perspectives, face challenges. Our pilot study, employing a pre-trained FG-SBIR model, highlights the system’s struggle when query-sketches differ in viewpoint from target instances. Interestingly, a questionnaire shows that users nonetheless desire autonomy, with a significant percentage favouring view-specific retrieval. To reconcile this, we advocate for a view-aware system, seamlessly accommodating both view-agnostic and view-specific tasks. Overcoming dataset limitations, our first contribution leverages multi-view 2D projections of 3D objects, instilling cross-modal view awareness. The second contribution introduces a customisable cross-modal feature through disentanglement, allowing effortless mode switching. Extensive experiments on standard datasets validate the effectiveness of our method.

Method

We aim to handle both view-agnostic and view-specific retrieval using one model.
 

Architecture

Our model disentangles an input into its view and content semantics. Sketch-photo pairs from FG-SBIR datasets are used to learn cross-modal discriminative knowledge, whereas multi-view 2D projections from unlabelled 3D models help condition the encoder with view-aware knowledge.
Once trained, the content and view features are used for view-agnostic and view-specific retrieval as shown.
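
To make the two retrieval modes concrete, here is a minimal sketch of how a disentangled feature could be used at query time. It is an illustration under stated assumptions, not our released implementation: the feature layout (content entries first, view entries last), the helper names `split_feature` and `retrieve`, the additive combination of content and view distances, and the toy feature values are all hypothetical.

```python
import math


def l2_normalise(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def split_feature(feature, content_dim):
    """Assumed layout: the first `content_dim` entries are content, the rest are view."""
    return feature[:content_dim], feature[content_dim:]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, gallery, content_dim, view_specific=False):
    """Rank gallery names by distance to the query, closest first.

    View-agnostic mode compares content parts only. View-specific mode
    (an assumed formulation) adds the distance between view parts.
    """
    q_content, q_view = split_feature(query, content_dim)
    scored = []
    for name, feature in gallery.items():
        g_content, g_view = split_feature(feature, content_dim)
        d = distance(l2_normalise(q_content), l2_normalise(g_content))
        if view_specific:
            d += distance(l2_normalise(q_view), l2_normalise(g_view))
        scored.append((d, name))
    return [name for _, name in sorted(scored)]


# Toy features: 2 content dims followed by 2 view dims (values are made up).
gallery = {
    "shoe_A_front": [1.0, 0.0, 1.0, 0.0],
    "shoe_A_side": [1.0, 0.0, 0.0, 1.0],
    "shoe_B_side": [0.0, 1.0, 0.0, 1.0],
}
query_sketch = [0.9, 0.1, 0.0, 1.0]  # a side-view sketch of shoe A

agnostic = retrieve(query_sketch, gallery, content_dim=2)
specific = retrieve(query_sketch, gallery, content_dim=2, view_specific=True)
```

In this toy setup, view-agnostic mode places both photos of shoe A ahead of shoe B regardless of their view, while view-specific mode ranks the side view of shoe A first and pushes the front view of the same shoe to the bottom. Switching modes needs no retraining, only a different choice of which feature parts enter the distance.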

Results


BibTeX

@inproceedings{sain2024freeview,
  title={{Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval}},
  author={Aneeshan Sain and Pinaki Nath Chowdhury and Subhadeep Koley and Ayan Kumar Bhunia and Yi-Zhe Song},
  booktitle={ECCV},
  year={2024}
}

Copyright © Aneeshan Sain | Last updated: 24 Mar 2023 | Template Credit: Nerfies