Teaching AI Ethics 2025: Bias

This post is the first in a series of nine articles updating the “Teaching AI Ethics” resources published in 2023. In the original series, I presented nine areas of ethical concern based on research into artificial intelligence, including the growing technologies of generative artificial intelligence.

For the 2023 articles, which include still-relevant case studies and teaching ideas, check out the original series of posts:

In the introductory post to this updated series, I explain why I have decided to refresh those articles. By now in 2025, many people are much more familiar with technologies like ChatGPT, Microsoft Copilot, and Google Gemini, and the inherent biases of these technologies have been made apparent time and time again.

Bias is still one of the biggest issues in generative AI output. Although the functionality of the technologies has changed, the core method of building them – primarily scraping large amounts of data from the internet – means that bias and discriminatory content are still baked into each of these technologies, whether they generate text, images, video, audio, and so on.

In this article, I will discuss where that bias comes from, why it is problematic, and what various organisations and researchers have done since 2023 to attempt to mitigate the problems.

Like the original series, I will finish each article with teaching examples which link the ideas in the article to curriculum areas, mostly from secondary education curricula across the world in a variety of disciplines.

What Is Bias in Generative Artificial Intelligence?

The term “bias” is complex in AI, partly because it has multiple meanings depending on the context. In machine learning and generative AI, it helps to distinguish between three overlapping but distinct types which I am going to call data bias, model bias, and human bias. Each of these contributes in different ways to the biased, stereotyped, or sometimes discriminatory outputs we see in generative tools like ChatGPT or image generators like Midjourney.

Data Bias

Data bias is possibly the most widely discussed type of bias present in GenAI. It refers to the bias baked into the training data: imbalances, omissions, and over-representations in the huge datasets scraped from the internet or social media. These datasets are massive, but not neutral. They reflect what’s most available online: a disproportionate amount of content written by certain demographics.

Emily Bender, Timnit Gebru, and their coauthors famously wrote about this in a paper that warned of the risks of “stochastic parrots”: AI systems that repeat and amplify the patterns found in training data, no matter how biased or harmful. Their warning was clear: when large language models are trained on data drawn mainly from white, male, English-speaking voices (the dominant demographic of English-language writers online), that becomes the model’s default perspective, regardless of who is actually using the tool.

So if a model is trained on web text where CEOs are mostly men, or online forums where racial and gendered stereotypes are common, it will internalise and reproduce those patterns. The model doesn’t know these are problematic; it just predicts what it sees in the data.
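This pattern-copying is easy to demonstrate at a tiny scale. The sketch below uses an invented six-sentence “corpus” (not real training data) to show how a simple count over skewed text already encodes a gendered association:

```python
from collections import Counter

# A toy "scraped" corpus, invented for illustration. Most sentences
# about CEOs use male pronouns, mirroring the skew found in web text.
corpus = [
    "the ceo said he would expand the company",
    "the ceo announced he was stepping down",
    "the ceo confirmed he had signed the deal",
    "the ceo explained she would review the budget",
    "the nurse said she was on shift",
    "the nurse confirmed she had seen the patient",
]

# Count which pronouns co-occur with each occupation word.
pronouns = {"he", "she"}
counts = {"ceo": Counter(), "nurse": Counter()}
for sentence in corpus:
    words = set(sentence.split())
    for occupation in counts:
        if occupation in words:
            for pronoun in pronouns & words:
                counts[occupation][pronoun] += 1

print(counts["ceo"])    # Counter({'he': 3, 'she': 1})
print(counts["nurse"])  # Counter({'she': 2})
```

Nothing in the counting is “sexist”; the skew comes entirely from the data, which is exactly the point.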

Model Bias

Model bias is a little more subtle. It’s not just what the data says; it’s how the model interprets, amplifies, or generalises from that data. Even after applying so-called alignment or safety training, large language models can still produce biased results. One study tested a variety of models that had been carefully trained for safety – including recent models like GPT-4o – and found that they still made culturally biased associations when prompted in indirect or less obvious ways.

This is partly because models don’t “understand” fairness or equity. They’re incredibly good at picking up patterns, even subtle ones, and replicating them. So if men are the most common subjects associated with “CEO” in the data, the model might treat “man” as the unspoken default unless explicitly instructed otherwise.

In image generation, this bias is very easy to spot. Even neutral prompts like “a person walking a dog” in Stable Diffusion were more likely to generate images of men unless the prompt specified otherwise.

Human Bias

And of course there’s human bias, which shows up in how the data is created and labelled in the first place. We’re all biased, so much so that entire branches of cognitive science are devoted to examining the hundreds of ways we make judgements based on things like availability heuristics, ego, false assumptions, and prior experience.

For image generation, this often means the people manually labelling or categorising images are making their judgements with many kinds of conscious and unconscious assumptions.

Some of the largest image datasets, like ImageNet, relied on crowdsourced or outsourced human labour to tag millions of pictures. That might sound efficient, but the results often reflected individual, cultural, and systemic biases. Fei-Fei Li, the creator of ImageNet, described in her memoir how shocked her team was to discover the extent of racial and gender bias in their dataset – including examples of racism, sexism, and other forms of discrimination.

This isn’t always malicious. Sometimes it’s a reflection of someone’s subjective assumptions about what they see when they look at a picture, what they think is “normal,” or what they believe fits a label. At other times, deliberately racist labels have been applied to images online. Either way, once it enters the training data, that judgement becomes part of what the AI learns and reproduces.

Finally, human bias comes into play when the developers of an AI model decide which data to include, which to exclude, and what “rules” to encode into the finished product. Developers at companies like OpenAI, Meta, Google, and Amazon must sit down and deliberately make choices regarding which languages to prioritise, which datasets to use, and what values to encode into their models. Each of these points represents a further introduction of bias into the system.

These three types of bias – data, model, and human – layer on top of each other. Skewed data trains a model that learns and sometimes exaggerates the patterns, all grounded in decisions made by people during the creation and curation of the data.

And because developers often push the responsibility for bias mitigation onto the user, it becomes even more important that educators, students, and everyday users understand what bias is, where it comes from, and how it shows up in generative AI tools.

What Does Bias in GenAI Look Like?

Generative artificial intelligence models – whether text, image, audio, video, code, and so on – follow similar methods of training. At a very simple level, predictive algorithms are applied to large datasets in order to infer patterns in the data:

  • For language, that means syntax, grammar, vocabulary.
  • For images, relationships between pixels, hue, saturation, and colour, and how they connect to the labelled images in the dataset.
  • For audio, patterns in waveforms and digital representations of sound.

Because applications like ChatGPT are fluent in many human languages, it’s easy to fall into the trap of thinking they learn, reason, and think like people. But they are essentially very sophisticated predictive text. Those predictions have to come from somewhere.
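“Sophisticated predictive text” can be made concrete with a toy example. The sketch below (an invented three-sentence corpus, nothing like a real model) predicts the next word purely by frequency – and the skew in the data becomes the skew in the prediction:

```python
from collections import Counter, defaultdict

# Toy training text, invented for illustration, with the same kind
# of skew found in web-scraped data.
corpus = (
    "the ceo said he was pleased . "
    "the ceo said he was confident . "
    "the ceo said she was pleased ."
).split()

# Build a bigram table: for each word, count what follows it.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict(word):
    """Return the most frequent continuation seen in training (seen words only)."""
    return following[word].most_common(1)[0][0]

print(predict("said"))  # prints "he" -- the majority pattern wins outright
```

Real models are vastly more sophisticated than a bigram table, but the underlying logic – predict what the data makes most likely – is the same, which is why the biases in the data surface in the output.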

Bias in Large Language Models

In the case of large language models, those predictions come from the enormous collection of training data largely scraped from the internet. Though the internet is a vast source of data, it does not represent the whole of society. It is a snapshot of online writing from the late 1990s to the present day – with the vast majority of content written in the past decade. I explained this in an earlier article using the analogy of an AI iceberg.

Because of the issue of internet demographics discussed earlier in the section on “data bias” – who is contributing writing online – when you interact with a large language model like ChatGPT or Gemini, it has learned to predict the syntax and rules of language from a biased perspective.

There is nothing inherently wrong with the perspective of a white, male American writer, but it is not representative of the 100 million-plus users interacting with ChatGPT to create text. At such a large scale, the bias in the data can sway the output of language models. Without sufficient guardrails, the default position of language models can be to produce stereotypical, gendered, and sometimes racist or otherwise discriminatory content.

Image Generation Bias

Bias in generative AI is probably most noticeable in image generation, particularly because some organisations like Midjourney have deliberately steered away from imposing guardrails and rely on the user to prompt for more diverse or less stereotypical images.

Image generation models are constructed differently to large language models, but the premise of training on a large amount of data remains the same. One key difference is that the images used for training must be labelled with text describing their content, and the humans doing that labelling introduce more bias into the system.

Another issue comes from over-representation or stereotyping. For example, scraping images online labelled “CEO” will almost always return images that are predominantly of white men in suits sitting in corporate offices.

So we get a twofold problem in image generation, mostly arising from the way the data is labelled and the over-representation of certain image types in the dataset:

  1. Images can be biased toward discriminatory stereotypes applied during labelling.
  2. Image generation will tend toward stereotypical outputs for things like gender, race, and job roles or responsibilities.

This is very easy to see in platforms like Midjourney when you generate a photo of a CEO, a nurse, a construction worker, and so on.

“Photo of a CEO”, Midjourney v6.1. Choose one for your next LinkedIn profile pic.
“Photo of a nurse”, Midjourney v6.1. The sexualised images almost certainly come from labelled pornography in the dataset.
“Photo of a construction worker”, Midjourney v6.1. You wouldn’t want to be that third guy.

Mitigating Bias

The initial release of ChatGPT (the GPT-3.5 model, followed by GPT-4 in March 2023) was not expected to go as well as it did. Those early models produced many examples of bias and discrimination, including instances where ChatGPT would confidently state that a good doctor should be white and male, or reproduce harmful racial stereotypes.

With companies like Microsoft putting their reputations (and money) on the line in support of OpenAI’s model, it didn’t take long for guardrails and safety features to be added in an attempt to mitigate these problems.

At the output level, the main approaches have involved amending terms and conditions to push responsibility onto the user, and adding guardrails through system prompts.

A system prompt (sometimes called a prompt instruction) is a set of instructions added before the user’s input in every conversation. It acts as a kind of filter or intermediary layer that tells the model what it should or shouldn’t say.

You can find leaked examples of system prompts from models like Claude and Llama online, and companies like Anthropic sometimes publish them willingly. They’re interesting to read because they show how these companies shape the behaviour of the model before you even type a word.

Sometimes the instructions are blunt, such as: “Claude won’t produce graphic sexual or violent or illegal creative writing content.” At other times, they’re more abstract, like “Objectivity: Remain objective in your responses and avoid expressing any subjective opinions or beliefs.”
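Structurally, a system prompt is simply a message that the provider prepends to the conversation before your input ever reaches the model. A minimal sketch, using the message format common to most chat APIs (the instruction text is adapted from the published examples above, not a real leaked prompt):

```python
# A sketch of the message structure used by most chat APIs. The system
# message is invisible to the end user, but the model sees it first.
SYSTEM_PROMPT = (
    "Objectivity: Remain objective in your responses and avoid "
    "expressing any subjective opinions or beliefs."
)

def build_messages(user_input, history=None):
    """Prepend the provider's system prompt to the conversation."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_input})
    return messages

# The user types four words; the model receives the guardrail as well.
messages = build_messages("photo of a CEO")
```

The user never sees the system message, which is precisely why leaked system prompts are so revealing: they expose the instructions quietly riding along with every request.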

One simple way to observe system prompts in action is to compare a prompt like “photo of a CEO” in Midjourney with the same prompt in Microsoft Copilot, which uses OpenAI’s image generation, or ChatGPT itself. Midjourney tends to produce stereotypical images like those above, but Copilot and ChatGPT often show a more diverse result.

The underlying models and training data are likely very similar, so what’s changed? The system prompt or output guardrail. But it’s important to note that this doesn’t fix the bias in the data or the way the model learned. It’s more like a band-aid over the broader issue.

Photo of a CEO generated in Microsoft Copilot (still using DALL-E 3)
Photo of a CEO generated in ChatGPT (GPT-4o with updated image generation). The model rewrites the prompt as: “A poised East Asian woman in her late 30s or early 40s, exuding confidence in a modern office setting. She wears a tailored navy-blue suit over a crisp white shirt and stands with her arms crossed. Behind her are sleek glass windows revealing a subtle cityscape. Soft, natural lighting highlights her relaxed yet professional demeanor.”

As I discussed earlier, any attempt to “hard code” guardrails or values into these models also introduces additional problems. Midjourney, for example, has in the past banned language including anatomical words for body parts.

If someone types “placenta,” “fallopian tubes,” “mammary glands,” “sperm,” “uterine,” “urethra,” “cervix,” “hymen,” or “vulva” into Midjourney, the system flags the word as a banned prompt and doesn’t let it be used. Sometimes, users who tried one of these prompts are blocked for a limited time for trying to generate banned content. Other words relating to human biology, such as “liver” and “kidney,” are allowed. 

MIT Technology Review
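A word-level blocklist like this is trivial to implement, which is part of the problem: it has no sense of context. A minimal sketch (the banned terms are those reported by MIT Technology Review; the matching logic is invented for illustration) shows why legitimate medical prompts get caught:

```python
# A sketch of a naive prompt blocklist -- the bluntest kind of guardrail.
BANNED = {
    "placenta", "fallopian", "mammary", "sperm", "uterine",
    "urethra", "cervix", "hymen", "vulva",
}

def is_allowed(prompt):
    """Reject the whole prompt if any single word is on the list."""
    return not (BANNED & set(prompt.lower().split()))

print(is_allowed("diagram of the liver"))             # True
print(is_allowed("medical diagram of the placenta"))  # False
```

The filter cannot distinguish an anatomy lesson from objectionable content; it only matches words. That bluntness is why such guardrails behave more like band-aids than fixes.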

What Has Changed Since 2023?

So other than a few fragile guardrails, what has changed since 2023? Most importantly, people are much more aware of bias in GenAI. At every school or university I visit, the majority of staff (and a decent proportion of students) know to check the output of text and image generators for obvious or subtle bias.

OpenAI acknowledges the problem in its advice for educators. Major news stories have covered problems with the guardrails, for example when Google’s Gemini produced “racially diverse Nazis”. And more and more research is being undertaken to explore the political, social, and cultural biases encoded into different AI models.

Some of this work aims to provide better “prompt-time” methods of ensuring fairness, such as the researchers behind Fair Diffusion, who use a more sophisticated approach to guardrails than simple system prompts.

Friedrich, F., Brack, M., Struppek, L. et al. Auditing and instructing text-to-image generation models on fairness. AI Ethics (2024). https://doi.org/10.1007/s43681-024-00531-5

Other studies aim to dispel myths like “AI just reflects society” by proving that the bias in GenAI models amplifies and worsens societal biases.

Methods of guardrailing and applying safety features will continue to improve. However, as of writing this updated post in 2025, I am not aware of any successful research demonstrating an ability to mitigate the “upstream” bias encoded through the training data. Until we find ways to source better data and build effective models from smaller, more diverse datasets, LLMs, image generators, and other forms of GenAI will always be prone to bias.

Teaching Bias

In the original 2023 collection, each article ended with a selection of ideas for teaching the issue in the context of existing curriculum areas. These 2025 updates will similarly align concepts from the articles to standards from typical curricula across the world, and in particular the Australian, UK, US, and IB curricula. For readers teaching in higher education, these examples will also be suitable across a wide range of disciplines.

My key point is that we do not need specialised “AI literacy” classes to deliver quality instruction on AI ethics – we already have the expertise we need in schools and universities.

English

In the English curriculum, students are required to understand and identify bias in texts, identify point of view, and reflect on the validity of sources. This provides an excellent opportunity to discuss questions like “Is AI an ‘author’?”, “Where does an AI model get its information from?”, or “Is AI output biased?” These would be perfect for an analytical or persuasive writing unit.

Health & Physical Education

In Health & Physical Education, students examine how stereotypes influence wellbeing, relationships, and inclusion. Topics include gender roles, cultural diversity, and discrimination. This provides a valuable opening to explore how AI systems may reinforce gender or cultural stereotypes, for example, in health recommendations or sports analytics. Students might ask “Should AI systems be taught about cultural sensitivity?” or “How can we design health technologies that reduce bias rather than amplify it?”

Civics and Citizenship

The Civics and Citizenship curriculum focuses on fairness, equality, and social cohesion, including media literacy and the influence of media on public opinion. These outcomes align well with discussions such as “Can AI-generated media influence democracy?”, “Who is responsible for biased political information created by AI?”, or “How should democratic societies regulate biased technologies?” These are great questions for debates or policy simulation activities.

Mathematics (Statistics)

In senior mathematics, students analyse statistical claims and evaluate data sources for bias. This directly supports inquiry into algorithmic bias, sampling bias, and data representation in AI training sets. Questions like “How can biased data lead to unfair algorithms?” or “Can math help detect or prevent bias in AI?” turn abstract statistical ideas into real-world ethics and technology projects.
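For a classroom-ready demonstration, a short simulation (toy data, invented proportions) shows how a “convenience sample” that over-represents one group skews an estimate – the same mechanism behind skewed training datasets:

```python
import random

random.seed(0)  # make the demonstration repeatable

# Toy population: exactly 50% group A and 50% group B.
population = ["A"] * 10_000 + ["B"] * 10_000

# A biased "convenience sample": group A members are four times as
# likely to be sampled -- like scraping a site where one demographic
# posts far more often.
weights = [4 if person == "A" else 1 for person in population]
sample = random.choices(population, weights=weights, k=1_000)

estimate = sample.count("A") / len(sample)
print(f"True share of A: 0.50, sampled estimate: {estimate:.2f}")
```

The estimate lands close to 0.8 rather than the true 0.5, and students can vary the weights to see how quickly a sampling process distorts the picture of the underlying population.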

History

History students evaluate sources for reliability and author perspective, which opens up questions into how AI reconstructs history through training data. Students might explore questions like “Can an AI write accurate history?”, “What happens when AI is trained on biased historical sources?”, or “Who decides which voices are remembered in training datasets?” This could lead to digital historiography projects and critiques.

Computer Science / Digital Technologies

In computer science courses, students analyse the design of computational artifacts for bias and consider the impact of inequitable systems. This makes it the perfect place to address questions like, “How does training data shape AI bias?”, “How do we build fair algorithms?”, or “Can AI make ethical decisions?” These could form the basis of practice coding projects (including using AI coding tools), algorithm audits, or ethics case studies.

Social Studies

Social Studies emphasises evaluating source credibility and recognising bias in civic discourse. Like English and History, this aligns with evaluating AI-generated news, political messaging, or deepfakes. Students could explore questions like “How can we tell if an AI-made source is trustworthy?” or “What impact does AI have on our understanding of truth in civic life?”

Theory of Knowledge (IB)

TOK students examine personal and ideological bias across different ways of knowing. AI fits naturally into this lens – prompting inquiries such as “Can AI ‘know’?”, “Can machines be objective?”, or “How do biases in programming shape our perception of truth?” These are ideal for extended essays or oral presentations exploring epistemological questions about machine knowledge.

Visual Arts

In Visual Arts, students explore how AI-generated images can reflect visual bias and stereotypes, such as reinforcing narrow beauty standards or cultural clichés. This opens discussion on using prompts to steer AI toward more inclusive outputs, such as “Should we prompt AI to represent diversity?” or “Why do some stereotypes persist despite prompt guidance?” It also highlights a key limitation: prompting can’t fix biased training data, encouraging students to think critically about ethics, representation, and authorship in digital art.

Obviously this is a non-exhaustive list of ideas, and although I am an English and Literature teacher myself, I am certainly not a subject-matter expert in every domain! If you have other ideas or ways you have taught about AI bias, then please use the contact form at the end of this post to get in touch.

The next article will focus on GenAI and environmental issues – sign up to the mailing list below to stay up to date on its publication.

Subscribe to the mailing list for updates:


Note on the images: In the 2023 version of Teaching AI Ethics I generated images in Midjourney. This time around, I have sourced images from https://betterimagesofai.org/. I still use AI image generators, but due to the environmental concerns and the contentious copyright issues discussed in these articles, I am more conscious of my use. Better Images of AI includes an excellent range of photos, illustrations, and digital artworks which have been generously licensed for commercial and non-commercial use.

Cover image for this article: Hanna Barakat & Cambridge Diversity Fund / https://betterimagesofai.org / https://creativecommons.org/licenses/by/4.0/

All articles in this series are released under a CC BY NC SA 4.0 license.

Want to learn more about GenAI professional development and advisory services, or just have questions or comments? Get in touch:
