Ethical and Practical Considerations For Compensation of Crowdsourced Research Participants
Greg Norcie Carnegie Mellon University 5000 Forbes Avenue Pittsburgh, PA 15213 ganorcie@andrew.cmu.edu
Abstract
The Human-Computer Interaction community, like all research communities, strives to maintain ecological validity. However, recent papers suggest that much research in the social sciences uses non-representative samples. In an attempt to increase diversity and lower costs, many HCI studies now use Amazon Mechanical Turk to crowdsource user studies. This position paper attempts to show that the pay these workers receive presents an issue under the National Research Act of 1974, and that said low pay also has a homogenizing effect on the Mechanical Turk participant pool.
Introduction
In 2010 Joe Henrich at the University of British Columbia and his co-authors published a paper entitled "The WEIRDest People In The World"[1], which points out that most research in the social sciences still focuses on "W.E.I.R.D." people: western, educated, industrialized, rich and democratic. Henrich's nearly 80 page paper calls into question whether several cognitive science principles are truly common to all of humanity, since many studies which "proved" various theories in cognitive science used samples with the “W.E.I.R.D.” characteristics listed above. In this paper we discuss the practical and ethical concerns raised by current pay rates for research participants on Mechanical Turk.
Keywords
Mechanical Turk, crowdsourcing, research ethics
ACM Classification Keywords
H5.3 Group and Organization Interfaces, K.4.1 Public Policy Issues: Ethics
General Terms
Human Factors, Ethics
Copyright is held by the author/owner(s). CHI 2011, May 7–12, 2011, Vancouver, BC, Canada. ACM 978-1-4503-0268-5/11/05.
What Is Mechanical Turk?
Amazon Mechanical Turk is a crowdsourcing application which takes its name from The Turk, a famous 18th century hoax. The Turk was alleged to be a chess-playing automaton, but in actuality was controlled by a human operator hidden within the contraption. Amazon developed its own “Mechanical Turk” service as an internal project to spot duplicate product pages. As an incentive to use the system, users were paid a few cents for every duplicate they spotted. The service was so popular that Amazon opened it up to the general public. On Mechanical Turk, people (“requesters”) can post human intelligence tasks ("HITs") for human workers ("Turkers") to perform in exchange for a pre-agreed fee. Some examples of typical HITs listed on the Mechanical Turk website[2] include correcting the spelling of search terms, deciding if two products are the same, or translating text from English to French. Most of these HITs are tasks that can't be easily automated and require human intelligence, hence Mechanical Turk's tagline: "Artificial artificial intelligence." Most people posting HITs to the service use Mechanical Turk as a way to contract out repetitive, simple tasks that are computationally non-trivial due to the ill-defined nature of the problem (such as "Is this the appropriate category?") and/or the lack of a suitable algorithm (such as with natural language translation).
Why Do We Use Mechanical Turk?
The HCI community sees the low cost of Mechanical Turk studies as a major benefit. The first paper to suggest using Mechanical Turk for user studies was a paper from CHI 2008 entitled “Crowdsourcing User Studies with Mechanical Turk”, which stated: “User studies are important for many aspects of the design process and involve techniques ranging from informal surveys to rigorous laboratory studies. However, the costs involved in engaging users often requires practitioners to trade off between sample size, time requirements, and monetary costs. Micro-task markets, such as Amazon’s Mechanical Turk, offer a potential paradigm for engaging a large number of users for low time and monetary costs.” [3]
Researchers have special ethical obligations. Experiments that would be completely legal for a private entity to perform would not pass muster with a university Institutional Review Board. We, as researchers, are held to a higher standard than the general public. So we must ask ourselves: just because we can pay Turkers such low wages, does it follow that we should? Even though Turkers and other research participants are technically independent contractors, and thus not subject to minimum wage laws, if researchers would voluntarily pay someone the US federal minimum wage of $7.25/hour[4] in a lab, why is it ethical to pay them approximately two dollars an hour online? We, the research community, claim not to want to coerce users with large payments, and this is a valid goal. But with such low wages, we may be polluting the Turker participant pool.
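The wage comparison above can be made concrete with a small calculation. The following is a minimal sketch, not part of the original study: the function names are hypothetical, and the example HIT (a $0.50 reward for a 15-minute task) is an assumed illustration chosen only because it works out to the two-dollars-an-hour figure mentioned in the text. The $7.25 constant is the US federal minimum wage cited above.

```python
# Minimal sketch: compare a HIT's effective hourly pay with the US federal
# minimum wage cited in the text. Function names and the example HIT are
# hypothetical illustrations, not data from any study.

US_FEDERAL_MINIMUM_WAGE = 7.25  # dollars per hour, as cited in the text [4]


def effective_hourly_wage(reward_dollars, minutes_to_complete):
    """Return the hourly wage implied by a HIT's reward and completion time."""
    if minutes_to_complete <= 0:
        raise ValueError("minutes_to_complete must be positive")
    return reward_dollars * 60.0 / minutes_to_complete


def minimum_wage_reward(minutes_to_complete, hourly_wage=US_FEDERAL_MINIMUM_WAGE):
    """Return the reward (in dollars) that pays `hourly_wage` for the task."""
    if minutes_to_complete <= 0:
        raise ValueError("minutes_to_complete must be positive")
    return hourly_wage * minutes_to_complete / 60.0


if __name__ == "__main__":
    # Assumed example: a 15-minute survey posted with a $0.50 reward.
    rate = effective_hourly_wage(0.50, 15)
    fair = minimum_wage_reward(15)
    print(f"Effective hourly wage: ${rate:.2f}")
    print(f"Reward needed to match minimum wage: ${fair:.2f}")
```

A requester could run a calculation of this kind on pilot completion times before posting a study, to see what reward would correspond to the wage a lab participant would receive.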
IRB Regulations Relating to Pay
Following a spate of studies with dubious ethical foundations, such as the Stanford prison experiment, Congress passed the National Research Act of 1974, which required any university receiving federal research money to establish Institutional Review Boards (IRBs) to monitor research and ensure that proper ethical guidelines are followed. For example, participants must give informed consent, so populations that might be unable to properly give consent, such as children, prisoners, pregnant women, human fetuses and neonates, are given special protections by the IRB.
IRBs generally feel that participants should be fairly compensated for participating in a research study. But IRBs also want to avoid coercing research subjects. This presents a unique problem. An excessively high amount of compensation can be coercive, and thus most IRBs ban such practices. For example, the University of Miami's IRB guidelines[5] state quite clearly: “Incentives, compensation and/or other inducements to subjects should reflect the risk, discomfort or inconvenience associated with study participation; and they should not be so large as to result in any one group of individuals (such as the economically disadvantaged) bearing an unduly large share of the risks and burdens of research participation.” Virginia Commonwealth University's IRB[6] echoes this sentiment, stating: "Payment for participation in research may not be offered to the subject as a means of coercive persuasion. Rather, it should be a form of recognition for the investment of the subject's time, loss of wages, or other inconvenience incurred."
However, it is the position of this author that Turker pay has swung too far in the opposite direction. With Turkers often being paid an effective hourly wage approximately 50% of the US federal minimum wage, we must ask ourselves whether it is ethical to pay participants using a crowdsourcing application half of what we would pay them in a laboratory setting for the same task. Researchers have special ethical obligations. Experiments that would be completely legal for a private entity to perform would not pass muster with an Institutional Review Board. We, as researchers, are held to a higher standard than the general public. So we must ask ourselves: just because we can pay Turkers such low wages, does that mean we should? If we would pay someone the federal minimum wage of $7.25/hour to take a survey in a lab, or enter them in a raffle, is it ethical to pay them two dollars an hour online?[7] While it is admirable to want to avoid coercion, it is also in the spirit of the regulations to compensate participants fairly. In addition to the ethical issues raised by low Turker pay, there are also very real effects on the participant pool.
The Effects of Low Pay on The Mechanical Turk Participant Pool
Even if we set aside the ethical concerns, there are numerous practical concerns. The prevalence of such low wages is leading to a homogenization of the Mechanical Turk participant pool. "Who Are the Crowdworkers?: Shifting Demographics in Mechanical Turk" paints a bleak picture. In March 2008 Indians made up 8% of the Turker population; by November 2009 they made up 36%. In the same time period, the median income of Turkers dropped from the $40,000-$60,000 USD range. By late 2009 nearly one third of Mechanical Turk users reported making less than $10,000 USD annually. [7] If we intend to use crowdsourcing tools like Mechanical Turk to make our samples more diverse, then low Turker pay might actually be harming our cause, leading a disproportionate number of poor Indian users to participate in Mechanical Turk studies.
Conclusion
It is the opinion of this author that the current rates of pay for Mechanical Turk users present an ethical problem. We would pay these workers at least minimum wage in a lab. In the rare cases that we did not pay minimum wage, we would enter the participants in a raffle for some item worth much more than the minimum wage (such as a gift card or electronic device) so that at least one person was paid more than the minimum wage. To follow a different standard just because a study is done online is not acceptable. Furthermore, even if we accept that current pay practices are ethical, said pay practices are causing undesirable demographic shifts in the Turker population. We as a research community justify use of Mechanical Turk by claiming it gathers a more representative sample. [8] A 36% Indian sample[7] is not representative. We must decide as a research community whether current research practices on Mechanical Turk are ethical, and if so, whether said practices are also sustainable.
Acknowledgements
The author thanks Dr. Lorrie Cranor of CMU for allowing him to divert time from other projects towards this submission.
Citations:
[1] Henrich, J., Heine, S. & Norenzayan, A. (2010) The Weirdest People in the World? Behavioral and Brain Sciences
[2] https://mturk.com/mturk/welcome?variant=worker
[3] Aniket Kittur, Ed H. Chi, and Bongwon Suh. 2008. Crowdsourcing user studies with Mechanical Turk. In Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems (CHI '08)
[4] http://www.dol.gov/elaws/faq/esa/flsa/001.htm
[5] https://eprost.med.miami.edu/Eprost/Rooms/DisplayPages/LayoutInitial?Container=com.webridge.entity.Entity[OID[EED589984B4E8B4B9C9E8A4348AEEF54]]
[6] http://www.research.vcu.edu/irb/wpp/flash/XVII2.htm
[7] Ross et al. 2010. Who Are the Crowdworkers?: Shifting Demographics in Mechanical Turk. In Proceedings of the 28th international conference extended abstracts on Human factors in computing systems (CHI EA '10).
[8] Lauren A. Schmidt. Crowdsourcing for Human Subjects Research. CrowdConf 2010