Download the dataset here.
Download the slides for our talk at CSCW here.
@inproceedings{hata2017glimpse,
  title={A Glimpse Far into the Future: Understanding Long-term Crowd Worker Quality},
  author={Hata, Kenji and Krishna, Ranjay and Fei-Fei, Li and Bernstein, Michael},
  booktitle={CSCW: Computer-Supported Cooperative Work and Social Computing},
  year={2017}
}
The Visual Genome project contains multiple datasets that were collected over a long period of time. We focus our analysis on the image description dataset, the visual question answering dataset and the verification dataset. In total, we analyze 6.4 thousand crowd workers who contributed 8.89 million annotations over a period of 9 months.
We found that across all three datasets, workers were consistent in their microtask submissions. Contrary to prior literature on satisficing, workers maintain a similar quality of work over time and also become more efficient as they gain experience with the task. Only a small proportion of workers are affected by fatigue from continuous monotonous tasks. The paper goes into more detail about how consistent workers are with regard to accuracy, word diversity, and annotation speed. Read the paper to learn more.
To further investigate the cause of this consistency, we analyzed 1.1 thousand additional workers who contributed 0.67 million additional annotations under varying experimental conditions. We found that process-centric factors, such as the acceptance threshold used to accept or reject work (whether high at 96% or low at 70%), had no significant impact on the accuracy of annotations. Workers were consistent whether they knew (high transparency) or didn't know (low transparency) what these acceptance thresholds were. However, we did find that workers would self-select out of tasks they couldn't complete effectively.
Attaining high-quality judgments from crowd workers is often seen as a challenge. Given that workers perform consistently over time, we discuss the implications of this finding for crowdsourcing design. We show that models can be trained using a worker's first few submissions to predict their performance in the future. We reinforce that person-centric strategies should be employed when creating tasks.
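As a rough illustration of predicting future performance from a worker's first few submissions, the sketch below fits a one-variable least-squares line from each worker's mean accuracy over their first k submissions to their mean accuracy over the rest. This is not the model used in the paper: the function names, the 0/1 outcome encoding, and the choice of linear regression are all assumptions made for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct early-accuracy values")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def early_and_later_accuracy(outcomes, k):
    """Split one worker's 0/1 outcomes (hypothetical encoding: 1 = accepted
    annotation) into mean accuracy over the first k and over the remainder."""
    if len(outcomes) <= k:
        raise ValueError("worker needs more than k submissions")
    return mean(outcomes[:k]), mean(outcomes[k:])


def train(workers, k):
    """Fit later accuracy as a linear function of early accuracy."""
    pairs = [early_and_later_accuracy(w, k) for w in workers]
    return fit_line([p[0] for p in pairs], [p[1] for p in pairs])


def predict(model, first_outcomes):
    """Predict a new worker's future accuracy from their first submissions,
    clamped to the valid range [0, 1]."""
    slope, intercept = model
    return min(1.0, max(0.0, slope * mean(first_outcomes) + intercept))
```

To use it, call train on a list of per-worker outcome histories with a chosen k, then pass a new worker's first k outcomes to predict. If workers are as consistent as the analysis suggests, the fitted slope should be close to 1 and the intercept close to 0.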
Read the paper for more...