Last week, Newsweek published an article titled The Real Minimum Wage. The authors report that "in a weeks-long experiment, we posted simple, hourlong jobs (listening to audio recordings and counting instances of a specific keyword) and continually lowered our offer until we found the absolute bottom price that multiple people would accept, and then complete the task."
The results "showed" that Americans are the ones willing to accept the lowest possible salary for working on a task, compared even to people in India, Romania, the Philippines, etc. In fact, they found that there are Americans willing to work for 25 cents per hour, while they could not find anyone willing to work for less than $1/hr in any other country. The conclusion of the article? Americans are more desperate than anyone else in the world.
What is the key problem with this study? There are many more US-based workers on Mechanical Turk than workers of any other nationality. So, if you have a handful of workers from other countries and hundreds of workers from the US, you are almost certain to find more extreme values for the US. Why? To put it simply, you are searching harder within the US to find small values, compared to the effort placed on other countries. (There are other issues as well, e.g., workers who would work on this task are not necessarily representative of the overall population; the same workers are exposed to multiple, decreasing salaries; issues of anchoring; issues of workers falsely reporting to be from the US; whether the authors checked IP geo-location; etc. While all these are valid concerns, they are secondary to the very basic statistical problem.)
Finding a Minimum Value: A Probabilistic Approach
On an abstract, statistical level, testing workers from multiple countries to determine their minimum wage amounts to sampling from multiple "minimum wage distributions" and looking for the smallest value within each one of them.
Each probability distribution corresponds to the minimum wages that workers from different countries are willing to accept. The probability calculation can be found here.
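So, if we sample n workers, set the wage threshold at z = $0.25, and assume a uniform distribution for F, the cumulative distribution function of the minimum acceptable wage (for example, uniform between $0 and $10 per hour), then F($0.25) = 0.025. Assuming the workers' minimum wages are independent draws from F, the probability that we will find at least one worker willing to work for 25 cents is one minus the probability that all n workers demand more:

1 - (1 - F(z))^n = 1 - (1 - 0.025)^n = 1 - 0.975^n

Plotted as a function of n, this probability climbs steadily toward 1: the more workers we sample, the more likely we are to find a value at or below 25 cents/hour.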
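To make the growth concrete, here is a minimal sketch that tabulates the probability of finding at least one worker at or below the threshold, using the expression 1 - (1 - F(z))^n with F(z) = 0.025. The function name `prob_at_least_one` and the particular values of n are illustrative choices, not part of the original analysis, and the workers' minimum wages are assumed to be independent.

```python
def prob_at_least_one(n, f_z=0.025):
    """Probability that at least one of n independent workers
    has a minimum acceptable wage at or below z, where f_z = F(z)."""
    # Complement of "all n workers demand more than z".
    return 1.0 - (1.0 - f_z) ** n


# Illustrative sample sizes (hypothetical, not from the Newsweek study).
for n in (1, 10, 50, 100, 200, 500):
    print(f"n={n:4d}  P(at least one worker at or below z) = {prob_at_least_one(n):.3f}")
```

With only a handful of workers the probability stays small, while with a few hundred workers it is close to 1, even though every worker is drawn from the same distribution.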
So, how does this approach explain the findings of Newsweek?
We know that not all countries are equally represented on Mechanical Turk. Most workers are from the US (50% or so), followed by India (35% or so), and then by Canada (2%), UK (2%), Philippines (2%), and a variety of other countries with similarly small percentages. This means that in the study, we expect to have more Americans participating, followed by Indians, and then a variety of other countries. So, even if the distribution of minimum wages were identical across all countries, we expect to find lower wages in the country with the largest number of participants.
Since the majority of the workers on Mechanical Turk are from the US, followed by India, followed by Canada, the UK, etc., the illustration by Newsweek simply gives us the country of origin of the workers, in reverse order of popularity!
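A small simulation can illustrate the point. The sketch below gives every country the same (hypothetical) uniform distribution of minimum wages between $0 and $10 per hour, assigns sample sizes according to the approximate shares quoted above, and reports the average lowest wage found per country. The total of 200 participants, the $10 upper bound, and the function name `expected_sample_minimum` are all assumptions made for illustration.

```python
import random


def expected_sample_minimum(n, trials=2000, high=10.0, seed=0):
    """Average, over many simulated studies, of the lowest wage found
    among n workers whose minimum wages are uniform on [0, high]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.uniform(0.0, high) for _ in range(n))
    return total / trials


total_workers = 200  # hypothetical study size
shares = {"US": 0.50, "India": 0.35, "Canada": 0.02, "UK": 0.02, "Philippines": 0.02}

for country, share in shares.items():
    n = max(1, round(total_workers * share))
    print(f"{country:12s} n={n:3d}  average lowest wage found: ${expected_sample_minimum(n):.2f}/hr")
```

Under these assumptions the countries with the most participants show the lowest "minimum wage", although the underlying distribution is identical everywhere.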
At this point, someone may ask: what happens if the distribution is not uniform but, say, lognormal? (A much more plausible distribution for minimum acceptable wages.) For this specific question, as you can see from the analysis [here], this does not make much of a difference: The only thing that we need to know is the value of F(z) for the z value of interest.
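As a rough check of this claim, the sketch below swaps in a lognormal distribution and reuses the same expression, 1 - (1 - F(z))^n. The parameters (a median of $5 per hour and a log-scale standard deviation of 1) are hypothetical and chosen only for illustration; the lognormal CDF is written out with `math.erf` so that the example needs only the standard library.

```python
import math


def lognormal_cdf(z, mu, sigma):
    """CDF of a lognormal distribution at z > 0, where mu and sigma are
    the mean and standard deviation of the logarithm of the variable."""
    return 0.5 * (1.0 + math.erf((math.log(z) - mu) / (sigma * math.sqrt(2.0))))


def prob_at_least_one(n, f_z):
    """Probability that at least one of n independent draws is at or below z."""
    return 1.0 - (1.0 - f_z) ** n


# Hypothetical parameters: median minimum wage of $5/hr, log-std of 1.
mu, sigma = math.log(5.0), 1.0
f_z = lognormal_cdf(0.25, mu, sigma)
print(f"F($0.25) under the assumed lognormal: {f_z:.5f}")

for n in (20, 100, 500, 1000):
    print(f"n={n:4d}  P(at least one worker at or below $0.25) = {prob_at_least_one(n, f_z):.3f}")
```

The shape of the distribution only changes the single number F(z); the dependence on the sample size n is the same as in the uniform case.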
Going in depth: Extreme Value Theory
A more general question is: What is the expected maximum (or minimum) value that we expect to find when we sample from an arbitrary distribution? This is the topic of extreme value theory, a field in statistics that tries to predict the probability of extreme events (e.g., what is the biggest possible drop in the stock market? what is the biggest rainfall in this region?). Given the events in the financial markets in 2008, this theory has received significant attention in the last few years.
What is nice about this theory is that the fundamentals can be summarized very succinctly. The Fisher–Tippett–Gnedenko theorem states that, if we sample from a distribution, the maximum of the sample, suitably normalized, converges (when a limit exists) to a random variable that follows one of three distributions: Gumbel (type I), Fréchet (type II), or Weibull (type III).
All three types are special cases of the generalized extreme value distribution.
This theory has significant applications not only when modeling risk (stock market, weather, earthquakes, etc.), but also when modeling decision-making for humans: Often, we model humans as utility maximizers, who are making decisions that maximize their own well-being. This maximum-seeking behavior often results in the distributions described above.
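One standard textbook instance of this convergence can be checked numerically: the maximum of n independent Exponential(1) draws, shifted by ln(n), approaches a standard Gumbel distribution with CDF exp(-exp(-x)). The sketch below compares a simulated CDF with the Gumbel CDF at two points. The exponential example, the sample sizes, and the function names are illustrative choices, not taken from the original post.

```python
import math
import random


def gumbel_cdf(x):
    """CDF of the standard Gumbel (type I extreme value) distribution."""
    return math.exp(-math.exp(-x))


def empirical_cdf_of_shifted_max(x, n=200, trials=3000, seed=0):
    """Fraction of simulated studies in which max(sample) - ln(n) <= x,
    where each sample holds n independent Exponential(1) draws."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample_max = max(rng.expovariate(1.0) for _ in range(n))
        if sample_max - math.log(n) <= x:
            hits += 1
    return hits / trials


for x in (0.0, 1.0):
    print(f"x={x:.1f}  simulated={empirical_cdf_of_shifted_max(x):.3f}  Gumbel={gumbel_cdf(x):.3f}")
```

The simulated and theoretical values should agree up to sampling noise; heavier-tailed or bounded parent distributions would lead to the Fréchet or Weibull type instead.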
By Panos Ipeirotis
Panos is an Associate Professor at the IOMS Department at Stern School of Business of New York University. He is interested in crowdsourcing and in leveraging economics to solve computer science problems.