Investigation of multiple-choice question answering with mobile crowdsourcing
Aydin, Bahadir Ismail
With the rise of ubiquitous computing, crowdsourcing has begun to offer solutions to problems for which computers fall short. Most existing crowdsourcing applications task participants with answering open-ended questions, and there has been little work on using multiple-choice question answering (MCQA) for crowdsourcing. However, multiple-choice questions facilitate human collaboration, and their answers are easier to aggregate. In this thesis, we investigate the capabilities of crowdsourced MCQA systems. In order to design more effective aggregation methods and evaluate them empirically, we developed and deployed a crowdsourced system for playing the "Who Wants to Be a Millionaire?" (WWTBAM) quiz game live as the show is broadcast on TV. The client side of the system is a native Android application, which has been downloaded more than 300,000 times. Our system scales to collect data from thousands of mobile devices in real time. Designing such a large, real-time crowdsourcing system is challenging. In particular, our experiments on question distribution show that multicasting to mobile devices is lossy, and that even well-established push notification services suffer from low delivery rates and high jitter. We discuss these challenges and present our design and the solutions we developed to overcome them in the system architecture part of this thesis. Our system provided us with sufficient data to experiment with MCQA algorithms at large scale. Using this data, we first set a baseline for our tests with basic voting, which answers 94% of the easy questions and 63% of the hard questions correctly. Next, we present a detailed investigation of the factors affecting aggregation accuracy, such as question category, participant expertise, and answer timing. We then present our algorithms, which employ lightweight machine learning techniques over these features, together with their accuracy rates.
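The basic-voting baseline mentioned above can be sketched as a plurality vote over the crowd's answers. This is a minimal illustration, not the thesis's exact implementation; the function name and input format are assumptions.

```python
from collections import Counter

def plurality_vote(answers):
    """Aggregate crowd answers to one multiple-choice question by
    basic (plurality) voting: the choice selected by the most
    participants wins. `answers` is a list of chosen options,
    e.g. ["A", "B", "A", "C"]. Input format is illustrative."""
    counts = Counter(answers)
    choice, _ = counts.most_common(1)[0]
    return choice

# Example: three of five participants pick "B", so "B" wins.
print(plurality_vote(["A", "B", "B", "C", "B"]))  # -> B
```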
Based on our observations, we design a super player algorithm that uses all the features in a hybrid fashion; this algorithm correctly answers not only 97% of the easy questions but also 92% of the hard questions. Our results show that crowdsourcing with MCQA can produce very good results that can hardly be matched by crowdsourcing with open-ended questions or by pure machine learning techniques alone.
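One way to picture a feature-aware aggregator like the super player algorithm is as weighted voting, where each participant's answer counts with a weight derived from features such as past accuracy or answer timing. The weighting scheme below is a hypothetical sketch for illustration, not the thesis's actual algorithm.

```python
from collections import defaultdict

def weighted_vote(answers, weights):
    """Weighted-voting sketch: each answer contributes a per-participant
    weight (e.g. estimated expertise) to its choice's score, and the
    highest-scoring choice wins. The weight source is an assumption."""
    scores = defaultdict(float)
    for ans, w in zip(answers, weights):
        scores[ans] += w
    return max(scores, key=scores.get)

# Two high-expertise players (weight 2.0) outvote three novices (0.5 each).
print(weighted_vote(["A", "A", "B", "B", "B"],
                    [2.0, 2.0, 0.5, 0.5, 0.5]))  # -> A
```

With uniform weights this reduces to plain plurality voting, which is why feature-derived weights are what let such an aggregator outperform the basic-voting baseline on hard questions.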