Add like
Add dislike
Add to saved papers

Geometric Matching for Cross-Modal Retrieval.

Despite its significant progress, cross-modal retrieval still suffers from one-to-many matching cases, where the multiplicity of semantic instances in another modality could be acquired by a given query. However, existing approaches usually map heterogeneous data into the learned space as deterministic point vectors. In spite of their remarkable performance in matching the most similar instance, such deterministic point embedding suffers from the insufficient representation of rich semantics in one-to-many correspondence. To address the limitations, we intuitively extend a deterministic point into a closed geometry and develop geometric representation learning methods for cross-modal retrieval. Thus, a set of points inside such a geometry could be semantically related to many candidates, and we could effectively capture the semantic uncertainty. We then introduce two types of geometric matching for one-to-many correspondence, i.e., point-to-rectangle matching (dubbed P2RM) and rectangle-to-rectangle matching (termed R2RM). The former treats all retrieved candidates as rectangles with zero volume (equivalent to points) and the query as a box, while the latter encodes all heterogeneous data into rectangles. Therefore, we could evaluate semantic similarity among heterogeneous data by the Euclidean distance from a point to a rectangle or the volume of intersection between two rectangles. Additionally, both strategies could be easily employed for off-the-self approaches and further improve the retrieval performance of baselines. Under various evaluation metrics, extensive experiments and ablation studies on several commonly used datasets, two for image-text matching and two for video-text retrieval, demonstrate our effectiveness and superiority.

Full text links

We have located links that may give you full text access.
Can't access the paper?
Try logging in through your university/institutional subscription. For a smoother one-click institutional access experience, please use our mobile app.

Related Resources

For the best experience, use the Read mobile app

Mobile app image

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app

All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.

By using this service, you agree to our terms of use and privacy policy.

Your Privacy Choices Toggle icon

You can now claim free CME credits for this literature searchClaim now

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app