We built LabintheWild in 2012 (together with Krzysztof Gajos at Harvard) out of a need to collect data from participants of different ages and from different countries. Since then, LabintheWild studies have led to more than 50 peer-reviewed publications in various research fields, from Human-Computer Interaction, Usable Privacy & Security, and NLP to Psychology and the Medical Sciences. The vast majority of these publications present findings for which a diverse participant population was essential and which would not have been possible without LabintheWild (or at least not without substantial effort and cost). Examples include:
a comparison of visual website preferences across 43 countries (Reinecke and Gajos, 2014);
a study comparing visual crowding in people with and without dyslexia using our Virtual Chinrest (a method that estimates viewing distance by detecting a participant's blind spot) (Li et al., 2020);
a new NLP dataset with human-generated stories (August et al., 2020);
a new system for predicting how users will see their screen under varying lighting conditions, with participants ranging from 5 to 94 years of age (Reinecke, Flatla, Brooks, 2016);
a comparison between blind and sighted people's listening rates and understanding of conversational AI (Bragg et al., 2018; Bragg, Reinecke, Ladner, 2021);
a motor performance study of age-related effects on pointing performance with participants aged 5 through 85 (with a sample of more than 250k participants), which was used for predicting age-specific motor performance and Parkinson's disease (Gajos et al., 2020); and
a psychology study published in PNAS showing that the impact of sex and age on cognitive empathy is similar across 57 countries, with a sample of more than 300k participants (Greenberg et al., 2022).
How and Why LabintheWild Works
You've probably heard of Mechanical Turk, Prolific, and the like. These are online services that researchers often use to recruit and pay participants for their online studies. LabintheWild is similar, but instead of compensating participants financially, it compensates them with information about their performance and the ability to compare themselves to others. In other words, people can learn something about themselves. This design choice engages people's curiosity and makes participation a rewarding experience. Compared to platforms that offer monetary compensation, LabintheWild removes many participation barriers (such as the need to create an account) and motivates truthful responses and effortful participation.
Our studies show that the prospect of learning about oneself and comparing oneself with others is key in motivating people to participate (Jun, Hsieh, Reinecke, 2018; Li, Gajos, Reinecke, 2018; Oliveira, Jun, Reinecke, 2017). Slogans framed accordingly maximize participation and can determine who participates (August et al., 2018). Relatedly, we have evaluated how to effectively present the social comparison on the personalized results pages (Huber, Reinecke, Gajos, 2017) and how instructing participants using formal language leads to more attention (August and Reinecke, 2019).
Because LabintheWild relies on intrinsically motivated participants, it enables research that cannot be done in the lab or with existing platforms (e.g., MTurk, SurveyMonkey, Qualtrics Surveys, Prolific). Here's why:
Larger sample sizes: Anyone can participate in LabintheWild studies without creating an account, which means that LabintheWild reaches more participants. For example, the vast majority of previous LabintheWild studies have attracted more than 1,000 unique participants each, several have had more than 30,000 unique participants each, and some have exceeded 250,000. Notably, we don't usually invest in advertising. Instead, participants share the studies on social media and in online communities, for example to compare their results or talk about their experience. We are thankful for so many participants spreading the word! Research studies on MTurk, for comparison, can only reach an estimated maximum of about 7,300 unique participants, even if unlimited financial resources were available.
More diverse participants: Researchers have increasingly called for studies that involve more representative and/or diverse samples than the common American undergraduate samples. LabintheWild attracts participants who are more diverse in age, geography, and education than those on other platforms, which require participants to be at least 16 or 18 years of age and compensate in specific currencies (thus reducing the appeal to people from other countries). For example, MTurk participants are primarily from the US (75%) and India (16%), and rarely include people with disabilities. Prolific's pool mainly consists of participants from the UK and US. In contrast, LabintheWild has a lower share of participants from the US (around 30%) and instead attracts people from more than 200 countries. In previous studies, this has enabled comparisons of people from different countries and of different ages, education levels, and abilities.
Better data quality: LabintheWild participants are intrinsically motivated to perform well (i.e., they want to learn about themselves), which translates into better data quality. Financial incentives offered by other platforms often result in participants minimizing the time on a task in order to maximize their reward, and have led to an increase in fraudulent responses, such as those from duplicate accounts and bots. Researchers have therefore questioned the results of MTurk and Prolific studies due to data quality issues. We have shown in multiple studies that the data quality on LabintheWild is consistently better than that of MTurk (Ye, Reinecke, and Roberts, 2017; August and Reinecke, 2019) and that it is at least as good as the data collected in controlled laboratory studies (Reinecke and Gajos, 2015; Li, Gajos, and Reinecke, 2018; Li et al., 2020).