Bridging Visual and Textual Semantics: Towards Consistency for Unbiased Scene Graph Generation.

Scene Graph Generation (SGG) aims to detect visual relationships in an image. However, due to long-tailed bias, SGG is far from practical. Most methods depend heavily on statistical co-occurrence priors to generate a balanced dataset, so they are dataset-specific and easily affected by noise. The fundamental cause is that SGG is simplified into a classification task instead of a reasoning task, which limits the ability to capture fine-grained details and increases the difficulty of handling ambiguity. By imitating the dual-process theory in cognitive psychology, a Visual-Textual Semantics Consistency Network (VTSCN) is proposed to model the SGG task as a reasoning process and significantly relieve the long-tailed bias. In VTSCN, as the rapid autonomous process (Type 1 process), we design a Hybrid Union Representation (HUR) module, which is divided into two steps for spatial awareness and working-memory modeling. In addition, as the higher-order reasoning process (Type 2 process), a Global Textual Semantics Modeling (GTS) module is designed to individually model the textual contexts with the word embeddings of pairwise objects. As the final associative process of cognition, a Heterogeneous Semantics Consistency (HSC) module is designed to balance the Type 1 and Type 2 processes. Overall, our VTSCN offers a new way to design SGG models by fully considering the human cognitive process. Experiments on the Visual Genome, GQA, and PSG datasets show that our method is superior to state-of-the-art methods, and ablation studies validate the effectiveness of VTSCN. The source code is released on GitHub: https://github.com/Nora-Zhang98/VTSCN.
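To make the abstract's pipeline concrete, the following is a minimal PyTorch sketch of the described flow: a Type 1 branch (HUR) over the pair's union visual feature and spatial geometry, a Type 2 branch (GTS) over the word embeddings of the paired object labels, and an HSC step that balances the two before predicate classification. All layer choices, dimensions, the 8-d geometry encoding, and the gated fusion are assumptions for illustration; they are not taken from the paper or its released code.

```python
import torch
import torch.nn as nn


class VTSCNSketch(nn.Module):
    """Illustrative sketch of the VTSCN flow described in the abstract (not the authors' design)."""

    def __init__(self, vis_dim=2048, word_dim=300, hidden_dim=512, num_predicates=51):
        super().__init__()
        # Type 1 process: Hybrid Union Representation (HUR).
        # Step 1: spatial awareness from an assumed 8-d box-geometry encoding of the pair.
        self.spatial_fc = nn.Sequential(nn.Linear(8, hidden_dim), nn.ReLU())
        # Step 2: "working memory" over a pooled union-region visual feature (assumed size vis_dim).
        self.visual_fc = nn.Sequential(nn.Linear(vis_dim, hidden_dim), nn.ReLU())
        self.hur_fuse = nn.Linear(2 * hidden_dim, hidden_dim)

        # Type 2 process: Global Textual Semantics (GTS) over the subject/object word embeddings.
        self.gts_fc = nn.Sequential(nn.Linear(2 * word_dim, hidden_dim), nn.ReLU())

        # Associative step: Heterogeneous Semantics Consistency (HSC), modeled here
        # as a learned gate that balances the two processes (an assumption).
        self.gate = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim), nn.Sigmoid())
        self.classifier = nn.Linear(hidden_dim, num_predicates)

    def forward(self, union_visual, pair_geometry, subj_emb, obj_emb):
        # Type 1: visual + spatial cues of the object pair.
        hur = self.hur_fuse(
            torch.cat([self.visual_fc(union_visual), self.spatial_fc(pair_geometry)], dim=-1)
        )
        # Type 2: textual semantics of the pairwise object labels.
        gts = self.gts_fc(torch.cat([subj_emb, obj_emb], dim=-1))
        # HSC: gate the contribution of the two processes, then classify the predicate.
        g = self.gate(torch.cat([hur, gts], dim=-1))
        fused = g * hur + (1.0 - g) * gts
        return self.classifier(fused)


if __name__ == "__main__":
    model = VTSCNSketch()
    logits = model(
        torch.randn(4, 2048),  # pooled union-region visual features for 4 pairs
        torch.randn(4, 8),     # box-geometry encodings
        torch.randn(4, 300),   # subject word embeddings (GloVe-sized, assumed)
        torch.randn(4, 300),   # object word embeddings
    )
    print(logits.shape)  # torch.Size([4, 51]) predicate logits per pair
```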
