We need to talk about Mechanical Turk: what 22,989 hypothesis tests tell us about publication bias and p-hacking in online experiments

Abel Brodeur, Nikolai Cook, Anthony Heyes

Research output: Working paper/Preprint

Abstract

Amazon Mechanical Turk is a very widely used tool in business and economics research, but how trustworthy are results from well-published studies that use it? Analyzing the universe of hypotheses tested on the platform and published in leading journals between 2010 and 2020, we find evidence of widespread p-hacking, publication bias, and over-reliance on results from plausibly under-powered studies. Even ignoring questions arising from the characteristics and behaviors of study recruits, the conduct of the research community itself substantially erodes the credibility of these studies' conclusions. The extent of the problems varies across the business, economics, management, and marketing research fields (with marketing especially afflicted). The problems are not getting better over time and are much more prevalent than in a comparison set of non-online experiments. We explore correlates of increased credibility.
Original language: English
Publisher: SSRN
Number of pages: 57
DOIs
Publication status: Published - 12 Aug 2022

Keywords

  • online crowd-sourcing platforms
  • Amazon Mechanical Turk
  • p-hacking
  • publication bias
  • statistical power
  • research credibility
