Science & technology | Experimental psychology

The roar of the crowd

Crowdsourcing is transforming the science of psychology

May 26th 2012

ACCORDING to Joseph Henrich and his colleagues at the University of British Columbia, most undergraduates are WEIRD. Those who teach them might well agree. But Dr Henrich did not intend the term as an insult when he popularised it in a paper published in Behavioral and Brain Sciences in 2010. Instead, he was proposing an acronym: Western, Educated, Industrialised, Rich and Democratic.

One reason these things matter is that undergraduates are also psychology's laboratory rats. Incentivised by rewards, in the form of money or course credits, they will do the human equivalents of running mazes and pressing the levers in Skinner boxes until the cows come home.

Which is both a blessing and a problem. It is a blessing because it provides psychologists with an endless supply of willing subjects. And it is a problem because those subjects are WEIRD, and thus not representative of humanity as a whole. Indeed, as Dr Henrich found from his analysis of leading psychology journals, a random American undergraduate is about 4,000 times more likely than an average human being to be the subject of such a study. Drawing general conclusions about the behaviour of Homo sapiens from the results of these studies is risky.

This state of affairs, though, may be coming to an end. The main reasons undergraduates have been favoured in the past are that they are cheap, and easy for academics to recruit. But a new source of supply is now emerging: crowdsourcing.

Hivemind

Crowdsourcing is a way to get jobs like deciphering images, ranking websites and answering surveys done for money by online workers. Several firms offer the service, including oDesk, CrowdFlower and Elance. But by far the most popular for scientific purposes is Mechanical Turk, which is run by Amazon and is named after an 18th-century chess-playing machine in which a human secretly moved the pieces.

Mechanical Turk has more than 500,000 people, known as Turkers, in its workforce. For the hard-pressed, cash-strapped psychologist, this is a godsend. Turkers, despite the fact that half of them have at least one degree, are willing to work for peanuts. (Their median wage is about $1.40 an hour.) Most, indeed, seem to regard the tasks they are set as more like a paying hobby than an actual job. And, crucially, they are growing more cosmopolitan with each passing year. Though 40% are still from America, a third are Indian and the rest come from about 100 other countries. That diversity means the “W” of WEIRD, at least, can be dropped, and the “I”, “R” and “D” may often be dispensed with as well. Of course, another bias—that of signing up for crowdsourcing—is introduced. But using Turkers instead of undergrads does offer some genuine diversity.

One researcher who has taken advantage of that diversity is David Rand, a lecturer in psychology at Harvard University. He is using Mechanical Turk to reconsider the results of several experiments originally conducted mainly on students. In a recent study of moral decision-making, for example, he recruited hundreds of Turkers to repeat a classic thought experiment known as the trolley problem. This confronts its participants with a dilemma—a runaway railway trolley will kill a group of people unless the subject of the study chooses to push a single individual in front of it, in order to slow it down. Doing so will kill that individual, so the dilemma is whether to kill one person deliberately, or several through inaction.

Dr Rand is unwilling to discuss the results of his re-run in detail, because they have not yet been formally published. But he will say that he found he could replicate the prior findings of trolleyology, as this branch of psychology is often known, only among the atheists in his sample of Turkers. Those with strong religious beliefs behaved in a dramatically different way, and such believers are more common among Turkers than Harvard undergraduates.

This result suggests that other studies whose findings might be sensitive to religious belief need revisiting. Nor is religion the only area where this is true. Dr Rand is, for example, conducting another cross-cultural experiment, to see why Americans and western Europeans treat co-operation and punishment differently from people in other places. In this he is building on previous work, rather than breaking genuinely new ground. But he is also showing how crowdsourcing can permit psychologists to do easily and cheaply what was once complicated and expensive.

Many hands make light work

Most researchers used to think the punishment of freeloaders was a universal human instinct that had evolved to promote co-operation. Studies in the West supported this belief. They showed that people band together to reward co-operative behaviour and to punish those who refuse to contribute to the common good. These experiments, which employed what are known as public-goods games to test individual choices, gave players money they could either contribute to the group, raising the value of everyone's stake, or hold for themselves, ultimately harming everyone if others refuse to co-operate. But they were lacking in two ways. One was their WEIRD participants. The other was more subtle. It did not occur to the experimenters to allow participants to punish co-operators as well as freeloaders, even though those who had been freeloading might wish to do so in revenge for having been punished themselves, in previous rounds of the game.
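For readers who want the mechanics spelled out, one round of such a game can be sketched in a few lines of Python. This is an illustration only, not the code used in any of the studies. The endowment of 20 units, the pot multiplier of 1.6, and the punishment terms (each punishment point costs the punisher 1 unit and the target 3) are assumed values chosen for the example, and the function names are hypothetical.

```python
from typing import List


def public_goods_payoffs(contributions: List[float],
                         endowment: float = 20.0,
                         multiplier: float = 1.6) -> List[float]:
    """Payoff to each player after one contribution round.

    Contributions are pooled, multiplied, and shared equally among
    all players, whether or not they contributed.
    (endowment and multiplier are assumed, illustrative values.)
    """
    share = sum(contributions) * multiplier / len(contributions)
    return [endowment - c + share for c in contributions]


def apply_punishment(payoffs: List[float],
                     points: List[List[float]],
                     cost: float = 1.0,
                     impact: float = 3.0) -> List[float]:
    """Apply a punishment stage to the payoffs.

    points[i][j] is the number of punishment points player i assigns
    to player j. Each point costs the punisher `cost` and the target
    `impact` (both assumed values). Nothing restricts who may be
    punished, so freeloaders can punish co-operators too.
    """
    result = list(payoffs)
    for i, row in enumerate(points):
        for j, p in enumerate(row):
            if i == j:
                continue  # players cannot punish themselves
            result[i] -= cost * p
            result[j] -= impact * p
    return result


# Three co-operators contribute everything; player 3 freeloads.
payoffs = public_goods_payoffs([20, 20, 20, 0])

# Antisocial punishment: the freeloader spends 2 points on player 0.
points = [[0, 0, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0],
          [2, 0, 0, 0]]
after = apply_punishment(payoffs, points)

print(payoffs)
print(after)
```

Under these assumed numbers the freeloader finishes the contribution round ahead of the three co-operators, and the punished co-operator ends up worse off still. That is the dilemma the experiments probe.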

But that did occur to Benedikt Herrmann of the University of Nottingham, in Britain. A few years ago Dr Herrmann ran a series of experiments designed to see how public-goods games would play out in 16 countries, not all of them rich and Western. This time, he allowed freeloaders to punish co-operators, a behaviour known as antisocial punishment. His results were striking. Most of the world, the experiments suggested, bears little resemblance to Harvard or, indeed, anywhere else in the West, where antisocial punishment is virtually absent. In places like South Korea, Greece, Russia and Saudi Arabia, antisocial punishment proved to be almost as common as collaboration.

Dr Rand is re-running Dr Herrmann's experiments on Mechanical Turk—at a tenth of the cost of the original work. The early results, published last year in Nature Communications, suggest Dr Herrmann was right. Punishment did not evolve, as conventional wisdom has it, as a positive behaviour intended to encourage co-operation. Instead, it evolved as a self-interested weapon to fend off competitors, even when that competition is, in fact, a strategy of collaboration. In places where rules and institutions do not protect co-operators, freeloaders consistently dominate.

Dr Rand's work is just a foretaste of what is possible. The ability to run experiments quickly, cheaply and globally promises to transform psychologists' understanding of human behaviour. Studies that would once have required months or years can now be done in days. Indeed, the technology of crowdsourcing itself may be modified by psychologists' interest in it. Until recently, one constraint on the experiments was the inability of Turkers to interact with each other in real time. That problem has now gone. Siddharth Suri, of Microsoft's New York research laboratory, has solved it by writing a piece of software that allows as many as 60 Turkers to interact in real time—a number that is expected to rise in the near future.

There are still plenty of kinks to be ironed out. As with anything else on the internet, those who use Mechanical Turk and its competitors are liable to be spammed and to receive answers from software “bots” pretending to be real people. And Turkers, despite being more diverse than undergraduates, are still a pretty skewed sample of humanity. In particular, they are younger and more liberal than people at large.

Questions of ethics have also arisen. Some people think research projects which pay wages of less than $2 an hour are exploitative—even though that is the going rate for other Turker activities. Conversely, according to Karen Fort, of France's Institute of Scientific and Technical Information, at least one university has already prohibited the use of grant funds for this sort of study, for fear that Turkers could claim status as employees.

For many researchers, though, the appeals of crowdsourcing—bargain prices, vast supply and enormous scale—are too attractive to ignore. Indeed, the new methodology might democratise the very practice of psychology, allowing those without a laboratory or university behind them to join in as well. Gabriele Paolacci, a marketing researcher at the Rotterdam School of Management who was once in precisely that position, has started a blog called “Experimental Turk” (experimentalturk.wordpress.com) to help draft guidelines for such freelance experiments.

The revolution, then, has begun. So far, Google Scholar, a website devoted to academic matters, counts 3,000 published papers that involve crowdsourced experiments. Discussions at conferences, among psychologists, behavioural economists, political scientists, linguists and computer scientists, suggest that may be the tip of the iceberg. It would be an exaggeration to say that crowdsourcing has turned the whole world into a laboratory. But it has certainly made psychology a lot less WEIRD.

This article appeared in the Science & technology section of the print edition under the headline "The roar of the crowd"