Leveraging Side Information to Improve Label Quality Control in Crowd-Sourcing

Yuan Jin, Mark Carman, Dongwoo Kim, Lexing Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

We investigate the possibility of leveraging side information to improve quality control over crowd-sourced data. We extend the GLAD model, which governs the probability of correct labeling through a logistic function in which worker expertise counteracts item difficulty, by systematically encoding different types of side information: worker information drawn from demographics and personality traits, item information drawn from item genres and content, and contextual information drawn from worker responses and labeling sessions. Modeling side information allows for better estimation of worker expertise and item difficulty in sparse data situations and accounts for worker biases, leading to better prediction of posterior true label probabilities. We demonstrate the efficacy of the proposed framework with overall improvements in both true label prediction and unseen worker response prediction, based on different combinations of the various types of side information, across three new crowd-sourcing datasets. In addition, we show that the framework has the potential to identify salient side information features for predicting the correctness of responses without requiring any true label information.
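The logistic relationship described in the abstract can be illustrated with a minimal sketch. It assumes the multiplicative parameterization commonly given for the original GLAD model, in which the probability of a correct label is the logistic function of worker expertise times inverse item difficulty. The abstract does not state how side information enters the model, so the linear offset below, together with the names `p_correct_with_side_info`, `features`, and `weights`, is a hypothetical illustration and not the authors' formulation.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_correct_glad(expertise: float, inv_difficulty: float) -> float:
    """Probability of a correct label under the assumed GLAD form.

    expertise: worker expertise (any real number; higher is better).
    inv_difficulty: inverse item difficulty (positive; smaller means harder).
    """
    return sigmoid(expertise * inv_difficulty)


def p_correct_with_side_info(
    expertise: float,
    inv_difficulty: float,
    features: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Hypothetical extension: side-information features shift the logit.

    ASSUMPTION: a linear offset sum(w * x) is added to the GLAD logit.
    The abstract does not specify the actual encoding used in the paper.
    """
    offset = sum(w * x for w, x in zip(weights, features))
    return sigmoid(expertise * inv_difficulty + offset)


if __name__ == "__main__":
    # A worker with zero expertise is at chance on a binary task.
    print(p_correct_glad(0.0, 1.0))
    # The same worker on an easy item versus a hard item.
    print(p_correct_glad(2.0, 1.0), p_correct_glad(2.0, 0.1))
    # Hypothetical side-information features nudging the probability upward.
    print(p_correct_with_side_info(2.0, 0.1, [1.0, 0.0], [0.5, -0.3]))
```

Under this assumed form, a harder item (smaller inverse difficulty) pulls the probability toward 0.5 for a competent worker, which is one way to read "expertise counteracts difficulty". A feature offset of this kind would let the model draw on worker, item, or session attributes when few labels are available.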

Original language: English
Title of host publication: Proceedings of the 5th AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2017
Editors: Steven Dow, Adam Tauman
Publisher: AAAI Press
Pages: 79-88
Number of pages: 10
ISBN (Electronic): 9781577357933
Publication status: Published - 27 Oct 2017
Event: 5th AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2017 - Quebec City, Canada
Duration: 24 Oct 2017 to 26 Oct 2017

Publication series

Name: Proceedings of the 5th AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2017

Conference

Conference: 5th AAAI Conference on Human Computation and Crowdsourcing, HCOMP 2017
Country/Territory: Canada
City: Quebec City
Period: 24/10/17 to 26/10/17
