Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels.

Peter Washington, Haik Kalantarian, Jack Kent, Arman Husic, Aaron Kline, Emilie Leblanc, Cathy Hou, Cezmi Mutlu, Kaitlyn Dunlap, Yordan Penev, Nate Stockham, Brianna Chrisman, Kelley Paskov, Jae-Yoon Jung, Catalin Voss, Nick Haber, Dennis P Wall

Cognitive Computation 2021 September

Background/Introduction: Emotion detection classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle compound and ambiguous labels. We explore the feasibility of using crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We hypothesize that training with labels that are representative of the diversity of human interpretation of an image will result in predictions that are similarly representative on a disjoint test set. We also hypothesize that crowdsourcing can generate distributions which mirror those generated in a lab setting.

Methods: We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold standard collection of images depicting pediatric facial expressions along with 100 human labels per image. To test the feasibility of crowdsourcing to generate these labels, we used Microworkers to acquire labels for 207 CAFE images. We evaluate both unfiltered workers as well as workers selected through a short crowd filtration process. We then train two versions of a ResNet-152 neural network on soft-target CAFE labels using the original 100 annotations provided with the dataset: (1) a classifier trained with traditional one-hot encoded labels, and (2) a classifier trained with vector labels representing the distribution of CAFE annotator responses. We compare the resulting softmax output distributions of the two classifiers with a 2-sample independent t-test of L1 distances between the classifier's output probability distribution and the distribution of human labels.

Results: While agreement with CAFE is weak for unfiltered crowd workers, the filtered crowd agree with the CAFE labels 100% of the time for happy, neutral, sad and "fear + surprise", and 88.8% for "anger + disgust". While the F1-score for a one-hot encoded classifier is much higher (94.33% vs. 78.68%) with respect to the ground truth CAFE labels, the output probability vector of the crowd-trained classifier more closely resembles the distribution of human labels (t=3.2827, p=0.0014).

Conclusions: For many applications of affective computing, reporting an emotion probability distribution that accounts for the subjectivity of human interpretation can be more useful than an absolute label. Crowdsourcing, including a sufficient filtering mechanism for selecting reliable crowd workers, is a feasible solution for acquiring soft-target labels.

Full text links

We have located links that may give you full text access.

Show additional links to paperHide additional links to paper

PubMed

Add to Saved Papers

Get 1-tap access

Related Resources

Renin-Angiotensin-Aldosterone System: From History to Practice of a Secular Topic.Sara H Ksiazek et al.International Journal of Molecular Sciences 2024 April 5

Prevention and treatment of ischaemic and haemorrhagic stroke in people with diabetes mellitus: a focus on glucose control and comorbidities.Simona Sacco et al.Diabetologia 2024 April 17

British Society for Rheumatology guideline on management of adult and juvenile onset Sjögren disease.Elizabeth J Price et al.Rheumatology 2024 April 17

Albumin: a comprehensive review and practical guideline for clinical use.Farshad Abedi, Batool Zarei, Sepideh ElyasiEuropean Journal of Clinical Pharmacology 2024 April 13

British Society of Gastroenterology guidelines for the management of hepatocellular carcinoma in adults.Abid Suddle et al.Gut 2024 April 17

Revascularization Strategy in Myocardial Infarction with Multivessel Disease.Alexander Jobs et al.Journal of Clinical Medicine 2024 March 27

For the best experience, use the Read mobile app

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app

All material on this website is protected by copyright, Copyright © 1994-2024 by WebMD LLC.
This website also contains material copyrighted by 3rd parties.

By using this service, you agree to our terms of use and privacy policy.

Your Privacy Choices

You can now claim free CME credits for this literature searchClaim now

Get seemless 1-tap access through your institution/university

For the best experience, use the Read mobile app

Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels.

Full text links

Related Resources

Trending Papers

For the best experience, use the Read mobile app