Visual Relationship Detection with Language prior and Softmax

Jaewon Jung, Jongyoul Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Visual relationship detection is an intermediate image-understanding task that detects two objects in an image and classifies the predicate describing the relationship between them. The three components are linguistically and visually correlated (e.g. "wear" is related to "person" and "shirt", while "laptop" is related to "table" and "on"); thus, the solution space is huge because many combinations are possible. This work exploits language and visual modules and proposes a sophisticated spatial vector. The resulting models outperformed the state of the art without costly linguistic knowledge distillation from a large text corpus or complex loss functions. All experiments were evaluated on the Visual Relationship Detection and Visual Genome datasets.
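To illustrate the general idea behind combining a visual module with a language prior and a softmax, here is a minimal sketch. This is not the authors' code: the fusion rule (adding log-prior to visual logits), the function names, and the toy numbers are all assumptions introduced for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over predicate scores.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def predicate_probs(visual_logits, language_prior):
    # Hypothetical fusion: combine per-predicate visual evidence with a
    # (subject, object)-conditioned language prior, then normalize.
    # The additive log-prior form is an assumption, not the paper's method.
    return softmax(visual_logits + np.log(language_prior + 1e-8))

# Toy example: 3 candidate predicates, e.g. ["on", "wear", "near"],
# for a (laptop, table) pair. All numbers are made up.
visual = np.array([2.0, 0.1, 0.5])   # visual-module logits (assumed)
prior = np.array([0.7, 0.05, 0.25])  # P(predicate | subject, object)
probs = predicate_probs(visual, prior)
print(probs.argmax())  # index of the most likely predicate
```

With these toy values the language prior and the visual evidence agree, so "on" (index 0) wins; in practice the prior helps most when the visual evidence alone is ambiguous.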

Original language: English
Title of host publication: IEEE 3rd International Conference on Image Processing, Applications and Systems, IPAS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 143-148
Number of pages: 6
ISBN (Electronic): 9781728102474
DOIs
State: Published - 2 Jul 2018
Event: 3rd IEEE International Conference on Image Processing, Applications and Systems, IPAS 2018 - Sophia Antipolis, France
Duration: 12 Dec 2018 - 14 Dec 2018

Publication series

Name: IEEE 3rd International Conference on Image Processing, Applications and Systems, IPAS 2018

Conference

Conference: 3rd IEEE International Conference on Image Processing, Applications and Systems, IPAS 2018
Country/Territory: France
City: Sophia Antipolis
Period: 12/12/18 - 14/12/18

Keywords

  • Deep learning
  • Image understanding
  • Visual relationship
