Where does AI bias come from?

AI models learn patterns from data, including patterns that reflect historical biases, stereotypes, and inequities in that data.

Why do AI models exhibit bias?

Models learn from data. If that data contains patterns we'd rather not perpetuate, the model learns those too.

This isn't a bug that can be fixed with better code. It's a fundamental property of learning from human-generated data: the model becomes a reflection of that data, including its flaws.

What kinds of bias appear in AI models?

Several distinct types:

Representational bias: Some groups appear more often in training data than others. The model "knows more" about well-represented groups.

Stereotypical associations: If training data frequently associates certain professions with certain demographics, the model learns those associations. "CEO" might default to male; "nurse" might default to female.

Quality disparities: The model may perform differently for different groups. Autocomplete might work better for common names than unusual ones. Translation might be more accurate for well-resourced languages.

Amplification: Models can actually amplify biases present in data. If 60% of "doctor" examples in training data are male, the model might associate "doctor" with male 70% of the time.
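One mechanism behind amplification can be shown with a toy simulation: if a system always picks the most likely association (as greedy, argmax-style decoding does), a 60/40 split in the data becomes a 100/0 split in the output. The 60% figure and the two decoding strategies below are illustrative assumptions, not measurements of any real model:

```python
import random

random.seed(0)

# Assumed training statistic from the text above: 60% of "doctor" examples are male.
P_MALE_IN_DATA = 0.60

def sampled(p):
    """Pick an association in proportion to its training frequency."""
    return "male" if random.random() < p else "female"

def greedy(p):
    """Always pick the most frequent association (argmax decoding)."""
    return "male" if p >= 0.5 else "female"

n = 10_000
share_sampled = sum(sampled(P_MALE_IN_DATA) == "male" for _ in range(n)) / n
share_greedy = sum(greedy(P_MALE_IN_DATA) == "male" for _ in range(n)) / n

print("training data:    60% male")
print(f"sampled decoding: {share_sampled:.0%} male")  # close to 60%: mirrors the data
print(f"greedy decoding:  {share_greedy:.0%} male")   # 100%: the majority wins every time
```

Real amplification is subtler than this caricature, but the direction is the same: any step that favors the most probable option pushes a statistical tendency toward a rule.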

Where does the biased data come from?

Training data for large language models is scraped from the internet, books, and other text sources. This data reflects:

  • Historical inequities: Literature from eras with explicit discrimination
  • Contemporary stereotypes: Online content isn't free of bias
  • Selection effects: What gets written, published, and preserved isn't a neutral sample of human thought
  • Overrepresentation: English-language, Western, and internet-user perspectives dominate
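Overrepresentation, at least, is straightforward to quantify once documents carry language labels. A toy sketch with made-up counts (the numbers below are illustrative, not real corpus statistics):

```python
from collections import Counter

# Made-up language labels for a toy corpus of 1,000 documents.
doc_langs = ["en"] * 880 + ["de"] * 40 + ["zh"] * 30 + ["sw"] * 2 + ["other"] * 48

shares = Counter(doc_langs)
total = len(doc_langs)
for lang, n in shares.most_common():
    print(f"{lang:>6}: {n / total:.1%}")
# A mix like this means the model sees overwhelmingly English-language
# (and disproportionately Western, internet-user) text.
```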

How do AI labs try to address bias?

Several approaches:

  1. Data filtering: Remove obviously offensive or biased content from training data. Imperfect; subtle bias remains.

  2. Balanced sampling: Intentionally include diverse perspectives. Limited by what data exists.

  3. RLHF (reinforcement learning from human feedback): Train the model to avoid biased outputs through human feedback. Raters flag problematic responses.

  4. Constitutional AI: Define principles the model should follow, including fairness guidelines.

  5. Output filtering: Detect and block biased responses at inference time. A band-aid, not a cure.
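To see why output filtering is a band-aid, consider a naive pattern-based filter. The patterns here are invented for illustration (real filters are more sophisticated, but share the weakness): it catches exact phrasings while the same stereotype, reworded, slips through.

```python
import re

# Invented blocklist patterns, purely for illustration.
BLOCKED = [
    r"\ball (women|men|immigrants) are\b",
]

def passes_filter(text: str) -> bool:
    """Return False if the output matches any blocked pattern."""
    return not any(re.search(p, text, re.IGNORECASE) for p in BLOCKED)

print(passes_filter("All women are bad drivers."))       # False: blocked
print(passes_filter("Women tend to be worse drivers."))  # True: same stereotype, reworded
```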

None of these fully solves the problem. Bias mitigation is an ongoing process, not a one-time fix.

The measurement problem

Even defining "bias" is contested. Consider:

  • Is a model biased if it reflects true statistical patterns in the world? (More men are CEOs. Should the model know this?)
  • Is it biased if it ignores those patterns? (Should "CEO" be gender-neutral even though the role historically wasn't?)
  • How do we weigh different types of harm? (Stereotyping vs. erasing real disparities?)

Researchers develop benchmarks to measure bias, but the benchmarks encode assumptions. A model might "pass" one bias test while "failing" another that measures something slightly different.
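A minimal probe in the spirit of template-based bias benchmarks makes the point concrete. Every design choice below is an assumption the probe encodes, and the `complete(prompt)` interface is hypothetical:

```python
def pronoun_counts(complete, n_samples=50):
    """Count gendered pronouns in completions, per profession.

    `complete(prompt)` is an assumed interface returning one sampled continuation.
    """
    professions = ["doctor", "nurse", "engineer", "teacher"]
    counts = {}
    for job in professions:
        prompt = f"The {job} said that"
        he = she = 0
        for _ in range(n_samples):
            text = f" {complete(prompt).lower()} "
            he += " he " in text
            she += " she " in text
        counts[job] = {"he": he, "she": she}
    return counts

# Stub "model" that always answers the same way, just to run the probe.
print(pronoun_counts(lambda prompt: "he was running late"))

# Assumptions baked in: a binary he/she split, English only, a single
# sentence template, and an implicit ideal of a 50/50 pronoun ratio.
# A model can pass this probe and fail a differently designed one.
```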

What can you do as a user?

  • Be skeptical of defaults: If the model makes assumptions (about gender, race, profession, culture), question whether those assumptions are warranted.
  • Ask explicitly: Instead of letting the model assume demographics, specify what you want or ask it to consider multiple perspectives.
  • Notice patterns: If you see the same stereotyped outputs repeatedly, that's the model's training showing through.
  • Don't use AI for high-stakes decisions about people without human review: Hiring, lending, and criminal justice are all areas where bias could cause real harm.

Sources & Further Reading

🔗 Article
On the Dangers of Stochastic Parrots
Bender et al. · FAccT · 2021
🔗 Article
Language Models and Bias
Google AI Blog · 2023