Data Subset Selection With Imperfect Multiple Labels.

We study the problem of selecting a subset of weakly labeled data in which the labels of each instance are redundant and imperfect. In real applications, less-than-expert labels are obtained at low cost so that many labels can be acquired for each instance and then used to estimate the ground truth. However, on the one hand, preparing and processing the data itself can sometimes be even more expensive than labeling. On the other hand, noisy labels degrade the performance of supervised learning methods. We therefore introduce a new quality control mechanism on the labels of each instance and use it to select an optimal subset of data. Because the mechanism estimates the labeling quality of each instance, it indicates which instances already have enough reliable labels and how many labels still need to be collected for a given instance. In this paper, we first consider the data subset selection problem under the probably approximately correct (PAC) model. We then show how to find an ε-optimal labeled instance based on expected labeling quality. Furthermore, we propose new algorithms to select the k instances with the highest expected labeling quality. Using a reliable subset of data provides substantial benefit over using all data with imperfect multiple labels, and the expected labeling quality is a good indicator of where to allocate labeling effort: it shows how many labels should be acquired for an instance and which instances qualify for selection compared with others. Both theoretical guarantees and comprehensive experiments demonstrate the effectiveness and efficiency of our algorithms.
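
As a rough illustration of selecting the k instances with the highest labeling quality, the sketch below scores each instance by the fraction of its labels that agree with the majority label and keeps the top k. This agreement score is an assumed stand-in, not the expected labeling quality estimator or the selection algorithms of the paper, which the abstract does not specify. The names `labeling_quality`, `select_top_k`, and the toy `label_sets` are hypothetical.

```python
from collections import Counter


def labeling_quality(labels):
    """Assumed stand-in score: fraction of labels matching the majority label."""
    if not labels:
        return 0.0
    _, majority_count = Counter(labels).most_common(1)[0]
    return majority_count / len(labels)


def select_top_k(label_sets, k):
    """Return the ids of the k instances with the highest stand-in score."""
    ranked = sorted(
        label_sets.items(),
        key=lambda item: labeling_quality(item[1]),
        reverse=True,
    )
    return [instance_id for instance_id, _ in ranked[:k]]


# Toy data (hypothetical): each instance has several noisy binary labels.
label_sets = {
    "a": [1, 1, 1, 1, 0],  # 4 of 5 labels agree
    "b": [1, 0, 1, 0],     # 2 of 4 labels agree
    "c": [0, 0, 0],        # 3 of 3 labels agree
    "d": [1, 0, 0],        # 2 of 3 labels agree
}

selected = select_top_k(label_sets, k=2)
print(selected)
```

In this toy setting, an instance with a low score such as "b" would be a candidate for collecting additional labels before it is considered for selection.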
