
Research On Crowdsourcing Quality Control Optimization Method Based On Maximum Likelihood Estimation

Posted on: 2020-09-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Zheng
GTID: 2370330572983645
Subject: Software engineering
Abstract/Summary:
Crowdsourcing can solve many problems that machine algorithms cannot solve effectively, such as entity resolution, sentiment analysis, and image recognition. It does so by drawing on hundreds of thousands of workers (i.e., people) on the network. Public crowdsourcing platforms such as Amazon Mechanical Turk (AMT), CrowdFlower, and Upwork make these crowd resources easier to use. Human intelligence has been especially successful in machine learning and data mining, where practitioners collect labelled data for training systems by publishing labelling tasks, such as image labelling, on crowdsourcing platforms.

However, crowdsourcing may produce relatively low-quality results. Workers have different levels of expertise, untrained workers may be unable to complete certain tasks, and malicious workers may deliberately give incorrect answers. Quality control strategies are therefore needed to ensure the quality of task results: after receiving the workers' responses, the quality of each worker is modelled, and the true answer to each task is then inferred from the workers' quality. Existing quality control methods mainly use the EM method to maximize the likelihood and thereby estimate the workers' quality and the true answers. However, EM-based methods can only provide locally optimal estimates, and the computational cost of finding the globally optimal solution is too high to be practical.

This thesis focuses on the optimization of crowdsourcing quality control based on maximum likelihood estimation. First, we propose locally optimal crowdsourcing quality control algorithms based on maximum likelihood estimation, using the EM method to maximize the likelihood and so estimate the workers' quality and the true answers of the tasks. We propose one algorithm for a static worker model and one for a dynamic worker model. The static worker model represents a worker's quality by a probability value or a probability matrix. In the dynamic worker model, a worker's quality is affected by task difficulty and is described by a fitted function, so the model reflects in more detail how quality changes with the factors that influence it. After modelling the workers' quality, we use the EM method to estimate the parameters of the worker model and the true answer of each task.

Second, starting from the locally optimal results of the EM method, we propose an approximate globally optimal algorithm for maximum likelihood crowdsourcing quality control. This optimization algorithm consists of a task dominance-ordering model and an Iterative Neighbor Search algorithm, and it improves the accuracy of the estimated true answers by further maximizing the likelihood. The task dominance-ordering model prunes dominated task-answer combinations and retains the dominant ones. The Iterative Neighbor Search algorithm finds the task-answer combination with the maximum likelihood within a neighborhood. Together they increase the likelihood while improving the accuracy of the estimated true answers and worker quality.

Finally, we evaluate the proposed locally optimal and approximate globally optimal algorithms in a large number of comparative experiments on synthetic data and on real data collected on the AMT platform for a sentiment analysis task. The experimental results show that our method obtains better estimation results. In addition, we have implemented a crowdsourcing app as an experimental platform. It manages and distributes mobile crowdsourcing tasks (such as discount information labelling tasks), collects crowdsourcing data, and makes the crowdsourcing quality control algorithms available for use.
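As a rough illustration of the ideas summarized above, the following Python sketch applies EM to a static worker model in which each worker has a single probability of answering a binary task correctly, and then runs a plain single-flip hill climb on the likelihood. It is a minimal sketch under stated assumptions, not the thesis's algorithms: the function names (em_one_coin, log_likelihood, single_flip_search), the uniform prior over answers, the clipping constant, and the toy labels are all hypothetical, and the hill climb is a simple stand-in that does not reproduce the task dominance-ordering model or the Iterative Neighbor Search.

```python
import math


def em_one_coin(labels, n_iter=50):
    """EM for binary tasks; labels maps (task, worker) -> answer in {0, 1}.

    Returns (post, quality): post[t] is the estimated P(true answer of t is 1),
    quality[w] is the estimated P(worker w answers correctly).
    """
    tasks = sorted({t for t, _ in labels})
    workers = sorted({w for _, w in labels})
    # Initialise each posterior with the fraction of 1-votes (majority vote).
    post = {}
    for t in tasks:
        votes = [a for (tt, _), a in labels.items() if tt == t]
        post[t] = sum(votes) / len(votes)
    quality = {}
    for _ in range(n_iter):
        # M-step: quality = expected fraction of correct answers, clipped
        # away from 0 and 1 so the logarithms below stay finite.
        for w in workers:
            scores = [post[t] if a == 1 else 1 - post[t]
                      for (t, ww), a in labels.items() if ww == w]
            quality[w] = min(max(sum(scores) / len(scores), 1e-3), 1 - 1e-3)
        # E-step: posterior of each task under a uniform prior.
        for t in tasks:
            diff = 0.0  # log P(answers | truth=0) - log P(answers | truth=1)
            for (tt, w), a in labels.items():
                if tt == t:
                    log_odds = math.log(quality[w] / (1 - quality[w]))
                    diff += log_odds if a == 0 else -log_odds
            post[t] = 1 / (1 + math.exp(min(diff, 700.0)))
    return post, quality


def log_likelihood(labels, truth):
    """Log-likelihood of a hard answer assignment, with each worker's quality
    set to its maximum likelihood value (fraction of answers matching truth)."""
    counts = {}
    for (t, w), a in labels.items():
        correct, total = counts.get(w, (0, 0))
        counts[w] = (correct + (a == truth[t]), total + 1)
    ll = 0.0
    for correct, total in counts.values():
        for k in (correct, total - correct):
            if k:  # treat 0 * log(0) as 0
                ll += k * math.log(k / total)
    return ll


def single_flip_search(labels, truth):
    """Flip one task answer at a time, keeping flips that raise the likelihood."""
    truth = dict(truth)
    best = log_likelihood(labels, truth)
    improved = True
    while improved:
        improved = False
        for t in list(truth):
            truth[t] = 1 - truth[t]
            ll = log_likelihood(labels, truth)
            if ll > best + 1e-12:
                best, improved = ll, True
            else:
                truth[t] = 1 - truth[t]  # undo the flip
    return truth, best


# Toy data (hypothetical): worker "a" is reliable, "b" is mostly reliable,
# and "c" always gives the opposite answer to "a".
labels = {
    ("t1", "a"): 1, ("t1", "b"): 1, ("t1", "c"): 0,
    ("t2", "a"): 0, ("t2", "b"): 0, ("t2", "c"): 1,
    ("t3", "a"): 1, ("t3", "b"): 1, ("t3", "c"): 0,
    ("t4", "a"): 0, ("t4", "b"): 1, ("t4", "c"): 1,
}
post, quality = em_one_coin(labels)
majority = {"t1": 1, "t2": 0, "t3": 1, "t4": 1}
refined, refined_ll = single_flip_search(labels, majority)
```

On this toy input, majority vote labels t4 as 1, while both the EM posterior and the single-flip search settle on 0 because they discount the consistently contrary worker. Note that the likelihood used here is unchanged when every answer is inverted and every worker is treated as adversarial, so the starting point determines which of the two mirrored solutions a local search reaches.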
Keywords/Search Tags: Crowdsourcing Quality Control, Worker Model, EM, Maximum Likelihood Estimation, Optimization Algorithms