Stanford Study Reveals Hidden Racial Bias in AI Language Models

A new study from Stanford University has exposed a troubling trend in artificial intelligence: as large language models (LLMs) become less overtly racist, they are simultaneously becoming more covertly racist, especially towards speakers of African American English (AAE).

The research, published in Nature, demonstrates that major LLMs, including those developed by OpenAI, Facebook AI, and Google AI, continue to perpetuate harmful racial stereotypes dating back to the pre-Civil Rights era.

The researchers employed a technique from experimental sociolinguistics called the matched guise technique to compare how LLMs describe authors of the same content written in AAE or Standard American English (SAE). Their findings reveal that LLMs are significantly more likely to associate AAE users with negative stereotypes from the 1933 and 1951 Princeton Trilogy studies, such as "lazy," "stupid," and "dirty."

The study also found that as LLMs become larger and more sophisticated, overt racism decreases while covert racism actually increases. This trend is particularly concerning as LLMs are increasingly incorporated into decision-making systems for employment, academic assessment, and legal accountability.

In additional experiments, the researchers demonstrated that compared to SAE users, LLMs are more likely to assign AAE users lower prestige jobs, convict them of crimes, and sentence them to death rather than life imprisonment for committing murder.

The research team emphasises, that simply not mentioning race to an LLM, does not prevent it from expressing racist attitudes. Valentin Hofmann, a postdoc at the Allen Institute for AI and co-author of the study, suggests that these biases likely make their way into language models via the people who train, test, and evaluate them.

Sign up for AI-360