
Abstract

Neural networks have proven effective at learning rich low-dimensional representations of high-dimensional data such as images and text. There have also been many recent works that use neural networks to learn a common embedding between data of different modalities, specifically between images and textual descriptions, a task commonly referred to as learning visual-semantic embeddings. This is typically achieved using separate encoders for images and text, trained jointly with a contrastive loss. Inspired by recent work in relational reasoning and graph neural networks, this work studies the effect of a relational inductive bias on the quality of the learned visual-semantic embeddings. Training and evaluation are carried out with caption-to-image and image-to-caption retrieval on the MS-COCO dataset.
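The record does not spell out the contrastive objective referenced in the abstract. The sketch below shows one common formulation for visual-semantic embeddings: a bidirectional hinge-based triplet loss computed over a batch image-caption similarity matrix (as in VSE-style models), written in PyTorch. The class name, margin value, and encoder names in the usage comments are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveLoss(nn.Module):
    """Bidirectional hinge-based triplet loss over an image-caption
    similarity matrix. Matching pairs sit on the diagonal; all other
    entries in the same row/column act as negatives."""

    def __init__(self, margin: float = 0.2):
        super().__init__()
        self.margin = margin

    def forward(self, img_emb: torch.Tensor, cap_emb: torch.Tensor) -> torch.Tensor:
        # L2-normalise so the dot product is cosine similarity.
        img_emb = F.normalize(img_emb, dim=1)
        cap_emb = F.normalize(cap_emb, dim=1)

        # scores[i, j] = similarity between image i and caption j.
        scores = img_emb @ cap_emb.t()
        diagonal = scores.diag().view(-1, 1)

        # Hinge cost for caption retrieval (rows) and image retrieval (columns).
        cost_cap = (self.margin + scores - diagonal).clamp(min=0)
        cost_img = (self.margin + scores - diagonal.t()).clamp(min=0)

        # Do not penalise the positive (diagonal) pairs themselves.
        mask = torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
        cost_cap = cost_cap.masked_fill(mask, 0)
        cost_img = cost_img.masked_fill(mask, 0)

        return cost_cap.sum() + cost_img.sum()

# Usage with hypothetical image and text encoders (placeholder names):
#   img_emb = image_encoder(images)    # (batch, dim)
#   cap_emb = text_encoder(captions)   # (batch, dim)
#   loss = ContrastiveLoss(margin=0.2)(img_emb, cap_emb)
```

This sum-of-hinges variant is only one possibility; the thesis may instead use a hardest-negative or other contrastive formulation.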

Details

Title: Relational Inductive Biases for Visual-Semantic Embeddings
Author: Yan, Zhiyao
Publication year: 2019
Publisher: ProQuest Dissertations & Theses
ISBN: 9781085651325
Source type: Dissertation or Thesis
Language of publication: English
ProQuest document ID: 2298207511
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.