Ratings Systems Amplify Racial Bias on Gig-Economy Platforms

Five-star rating systems from services like Uber and TaskRabbit may seem like a neutral way to help customers connect with the best workers. But according to a new Yale SOM study, the platforms can spread the effects of racial discrimination by displaying ratings from biased users to those who otherwise would not discriminate.

An illustration of a ride-share driver with thumbs up and thumbs down ratings emerging from the rear windows of his car.
Sean David Williams
  • Tristan L. Botelho
    Associate Professor of Organizational Behavior
  • K. Sudhir
    James L. Frank ’32 Professor of Private Enterprise and Management, Professor of Marketing & Director of the China India Insights Program
  • Fei Teng
    Doctoral Student in Quantitative Marketing

In 2020, a former Uber driver filed a class action lawsuit against the ride-hailing giant, alleging that the app’s star-rating system was leading non-White drivers to get kicked off the app. “Throughout its history, Uber has made firing decisions based on a system that it knows is poisoned with racial discrimination,” said Shannon Liss-Riordan, lead attorney for the suit.

In a statement to the press, Uber argued that the opposite was true: “Ridesharing has greatly reduced bias for both drivers and riders, who now have fairer, more equitable access to work and transportation than ever before.”

At first glance, rating systems may indeed seem like an impartial arbiter for the gig-economy ecosystem, providing predictability that boosts consumer confidence, making online services more useful, and driving up transaction volumes—which results in increased earnings for workers.

But according to a new study by Yale SOM doctoral student Fei Teng and faculty members Tristan Botelho and K. Sudhir, these systems are not as neutral as they appear. The researchers found that ratings systems have the potential to amplify existing bias from users. Even worse, the systems can channel that bias to customers who otherwise would not discriminate, leading to a disparity in ratings and earnings between White and non-White workers.

Crunching data from an online labor market that matches service workers with customer jobs, the researchers studied the impact of displaying ratings on subsequent reviews and earnings. They found that while many customers did not discriminate against non-White workers, some did by systematically canceling appointments with such workers or by both canceling and giving lower ratings to them.

“A key idea in our paper is that not everyone has to discriminate for rating systems to produce bad outcomes for minorities. The paper allows for the fact that people differ in how they interact with minorities and we find that while some discriminate, a significant group of people are actually not biased at all,” Teng says. “But, interestingly, the unbiased folks are extremely sensitive to small differences in ratings as indicators of quality.”

Because customers perceive ratings as indicators of worker quality, the display of biased ratings has a spillover effect; specifically, the researchers found, it amplifies the ratings gap for non-White workers by 80% and increases their earnings gap on the platform by 28%. “When some people discriminate against minorities,” Sudhir explains, “future customers follow the same pattern of behavior because they believe these workers are worse performers. So even those who are unbiased cancel them more often and give them lower ratings. And those who are biased feel more justified to cancel more often and give even lower ratings to minorities, producing an amplification effect over time.”

Through this mechanism, a small number of biased customers can create systemic disadvantages for people of color in the gig economy. “It’s the bad apples spoiling the bunch,” Botelho says. He underlines the “hidden costs” from job cancellations. “There’s a lesser probability of getting work in the first place, so minority workers have to expend more effort to get the same number of jobs as their White counterparts, which further lowers their income.”

A growing body of research has demonstrated racial and gender biases in customer ratings on online platforms. “There’s a naive view in the gig economy that by allowing people to express their reaction to a service, in some perfect utopia, that we uncover a real truth,” Botelho says. “But often these ratings actually are taking into account things that are unrelated to the quality of service.”

The study also contributes to an ongoing discussion surrounding algorithmic discrimination, the researchers add. It underscores that if a system’s observable metrics are tainted by past discrimination, it reinforces and exacerbates existing structural inequities.

For example, on some platforms, Botelho notes, ratings are not just provided to other customers but also feed into a larger system that prioritizes access to work, causing workers to lose opportunities. “This process represents an algorithmic amplification of bias,” he says.

Designing more equitable algorithms is challenging, but the researchers suggest one possible approach. Their model allows them to estimate the effect of discrimination on ratings; it could also be used to determine the scores that would exist without biases in the system. “If we knew the differential that was unrelated to the worker, and we adjusted the ratings, we could level the playing field,” Botelho says.
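To make the adjustment idea concrete, here is a minimal sketch in Python. It is not the researchers' model: the article does not describe how the differential would be estimated or applied, so the `Worker` fields, the `adjust_rating` helper, and the 0.15-star differential are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    displayed_rating: float  # average star rating shown on the platform
    affected_by_bias: bool   # hypothetical flag: ratings depressed by customer bias


def adjust_rating(worker: Worker, bias_differential: float, max_stars: float = 5.0) -> float:
    """Add back an estimated bias-driven differential, capped at max_stars.

    Assumes the differential has already been estimated elsewhere;
    this function only applies it.
    """
    if not worker.affected_by_bias:
        return worker.displayed_rating
    return min(max_stars, worker.displayed_rating + bias_differential)


# Made-up numbers for illustration only; they do not come from the study.
workers = [Worker("A", 4.80, False), Worker("B", 4.65, True)]
estimated_differential = 0.15

adjusted = {w.name: round(adjust_rating(w, estimated_differential), 2) for w in workers}
print(adjusted)
```

In this toy case the two workers end up with the same displayed score once the assumed differential is added back. In practice the hard part is estimating that differential, which the sketch takes as given.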

Sudhir notes that the study’s methods can help illuminate the far-reaching influence of biases in education and even criminal justice contexts. “For example,” he says, “research shows that even for the same infraction, minority students are sent more often to the principal’s office, or dealt with through the criminal justice system. But this creates a detrimental ‘record’ early in the minority person’s life that is subsequently used to justify harsher treatment, even by those who are not biased. And this can lead to systematic disadvantages that get amplified over time.”

Department: Research