Large Language Models (LLMs) Show Increase in Covert Racial Bias with Human Intervention

3 min read

A recent study has uncovered that the expansion of large language models (LLMs) is correlated with the emergence of more discrete forms of racial prejudice. From the outset, it has been apparent that LLMs, such as ChatGPT, are exposed to racist perspectives due to the extensive internet content they are trained on. In response, developers have endeavoured to reduce their toxicity. Nevertheless, a recent study indicates that these endeavours, especially as models grow in size, merely tackle overtly racist perspectives, while allowing more concealed stereotypes to flourish and gain strength.

When prompted to evaluate speakers using African-American English (AAE), five AI models, including OpenAI’s GPT-4 and older models from Facebook and Google, displayed a tendency to attribute negative adjectives such as “dirty,” “lazy,” and “stupid” to AAE speakers as opposed to their Standard American English (SAE) counterparts, even when conveying the same meaning. Additionally, the models associated AAE speakers with less prestigious occupations or sometimes did not associate them with any job at all. Moreover, when tasked with making judgments regarding a hypothetical criminal defendant, the models were more inclined to recommend the death penalty for AAE speakers.

Leading companies like OpenAI, Meta, and Google have been incorporating feedback training to eliminate hateful views from LLMs, a process known as “alignment.” This process involves manual adjustments by human workers to refine the way the model responds to certain prompts, aiming to ensure alignment with desired values. While this approach has successfully addressed overt stereotypes, the recent study published on arXiv, yet to be peer-reviewed, indicates that it fails to tackle the subtle stereotypes relating to dialect prejudice, indicating a flaw in the current bias-solving strategies employed by leading companies.

The study also underscored that as the size of the models increases, covert stereotypes become more deeply ingrained. This finding serves as a warning to developers of chatbot models, such as OpenAI, Meta, and Google, as they strive to release larger models. While larger models generally demonstrate increased power and expressiveness, the exacerbation of covert racial bias necessitates more sophisticated tools to combat this issue. It remains unclear whether incorporating more AAE into training data or enhancing feedback efforts will suffice.

The use of AI models in critical decision-making processes is not just a possibility in the distant future but is already a reality. AI-driven translation tools are utilised in asylum cases in the US, and crime prediction software has been employed to determine whether teens should be granted probation. Additionally, employers utilising ChatGPT to screen applications may inadvertently discriminate against candidates based on race and gender, while bias against AAE could lead to misjudgments if models are used to analyse an applicant’s use of language on social media.

It is evident that as the capabilities of LLMs continue to expand, the issue of racial bias in AI models demands greater attention and more effective solutions. The study’s authors have shed light on the potential implications of racial bias, urging the need for a more comprehensive approach in addressing this pressing concern.

+ There are no comments

Add yours