Alcohol-Free AI: Can we stop pretending AI can be “safe”?

The air-con is working overtime as a class full of year nines stagger in from the yard after lunch. One or two of them recently discovered deodorant, so the smell of sweat is pasted over with an almost visible layer of Lynx Africa. Five minutes in, and it’s already so bad the students are asking to crack the windows, but outside it’s just topped 40 degrees C, and you’d rather not be responsible for cooking an entire class of teenagers.

So, you do what any good Australian teacher would do: you reach into the Esky under the teacher’s desk and pull out a couple of six-packs of alcohol-free beer. Cracking a couple of tinnies on a hot summer’s day is an Australian tradition, but you’re not a lawbreaker, and you’ve seen all of those government adverts urging you to drink responsibly. You’re going to give these kids a taste of safe booze.

Condensation drips down the sides of the icy cans. Chugging on their frosty beverages, the class finally starts to settle down. At last, you think, we can get some actual work done. Before you get into the lesson though, you give those kids a timely reminder:

“Remember kids, this beer is safe. All the alcohol was removed. Well, most of it. You can’t get drunk. Nobody’s going to wake up with a hangover. It’s unlikely to kick off a classroom brawl. I’m giving you this safe beer so you can learn how to handle alcohol when you get out into the real world.

“There’s no way around it, kids – on a hot day like today you’re going to be cracking tinnies. And when you’re a bit older, it probably won’t be alcohol-free like you’re drinking right now in the safe walls of this classroom. I’m going to help you learn appropriate and responsible alcohol consumption. Now, open your maths books to page 46 and start working through exercises one to twenty. If you run dry, give us a yell – I’ve got a few spare in the Esky.”

For the benefit of my non-Australian readers, nobody’s actually giving 15-year-olds cans of alcohol-free beer in the classroom; not even on a 40-degree day. The idea is, of course, ridiculous.

Instead, we talk to students in health classes and wellbeing programs about the potential harms of excessive alcohol consumption. The government runs adverts urging people to drink responsibly. Programs are put in place for people who, for whatever reason, can’t control their drinking. Drinking alcohol is limited to certain age groups, with some caveats, such as countries where young people can drink in the company of adult family members. We have social norms around drinking, and we don’t pretend that there is any such thing as “safe” alcohol.

Yet that’s exactly the pretence I keep seeing, time and time again, with artificial intelligence in education.

What is “safe” AI?

First, we need to set some boundaries and decide what safe AI looks like in an education context. With young people using these platforms, a wide range of features would be needed to make generative AI models safe. This isn’t an exhaustive list, but here are a few obvious requirements:

  • No harmful or toxic content: AI chatbots used in education need sufficient guardrails to prevent them from producing content that could be considered dangerous, abusive, discriminatory, or otherwise harmful.
  • No bias: Generative AI systems are notoriously biased around gender, race, religion, culture, disability, age, sexuality, and pretty much anything that strays too far from the white American male perspective. These models do not reflect the diversity of our students, and therefore generative AI deployed in education must account for bias.
  • Protection against deliberate misuse: AI in education must be guarded against deliberate misuse, such as prompting chatbots to create content that is misleading, spreads misinformation or disinformation, or can be used to exploit, bully, or harm others.
  • Safe advice: Generative AI is capable of giving unsafe or medically unsound advice on issues like mental and physical health, nutrition, exercise, and other areas where poor advice can do real harm.

This is just a handful of areas where developers need to put up specific guardrails to make an AI chatbot safe. But here’s the thing: even the biggest AI developers and deployers – companies like OpenAI, Microsoft, DeepMind, Anthropic, and Google – consistently fail to produce sufficiently robust guardrails on these models. They are inherently unsafe, inherently prone to bias, inherently prone to hallucinating or fabricating information, and notoriously easy to jailbreak.


So how could developers within education systems, often building third-party applications on top of these foundation models, hope to do better?

Yet that’s exactly what appears to be happening. Just recently, the NSW Department of Education released its NSWEduChat chatbot, trialling it in 16 public schools across the state. The chatbot is one of many similar efforts worldwide, including in other Australian states. According to an article in the Guardian, the chatbot “doesn’t have the same privacy concerns as other technologies”, and the NSW government page about the chatbot states that:

The safety of our users is paramount. From stringent content moderation to ethical AI practices, we take comprehensive measures to ensure that NSWEduChat remains a safe and beneficial educational tool.

https://education.nsw.gov.au/teaching-and-learning/education-for-a-changing-world/nsweduchat

Safety is mentioned again on the performance and benchmarking page, and the chatbot has a dedicated “Safety and Optimisation” page which details the methods used to safeguard the technology, including profanity filtering, “jailbreak prevention”, and “constitutional alignment”.

These efforts should be applauded: there appears to be constant monitoring, deliberate red-teaming, and a genuine attempt to encode the values of the state curriculum and education system into the model. Hopefully, as the trial progresses, we’ll get more transparency about the methods used to build these safety features and guardrails, and about exactly what kinds of red-teaming and jailbreaking attempts have been made and guarded against.

But how can we expect these chatbots and their guardrails to be maintained and kept “safe” when even the slightest tweak to the underlying models can produce a cascade of unexpected side effects? Any safety features or guardrails built into an application constructed on top of a large language model like GPT become potentially useless when the developer changes a setting, rolls out an update, or adjusts a parameter. We saw this very recently when OpenAI’s GPT went off the rails, and again with Google’s Gemini model, particularly its image generation.
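To make that fragility concrete, here is a minimal, hypothetical sketch in Python of the kind of wrapper a custom education chatbot typically sits inside. None of this is NSWEduChat’s actual code; the function names, prompt, and word list are invented for illustration. The point is where the guardrails live: everything the local developer controls (the system prompt and the input and output filters) sits on one side of an API call, while the model itself, with its weights, settings, and update schedule, sits on the other.

```python
# Hypothetical sketch of a guardrailed wrapper around a third-party foundation model.
# The blocked-term list, prompt, and function names are placeholders, not a real system.

BLOCKED_TERMS = {"example_blocked_word"}  # stand-in for a real moderation list

SYSTEM_PROMPT = (
    "You are a classroom assistant. Refuse requests for harmful, "
    "discriminatory, or age-inappropriate content."
)

def call_foundation_model(system_prompt: str, user_message: str) -> str:
    """Stand-in for a call to a hosted model API.
    The provider, not the education developer, controls the model and its updates."""
    raise NotImplementedError("Replace with a real call to the model provider.")

def guarded_chat(user_message: str) -> str:
    # Input-side guardrail: a crude keyword filter on the student's message.
    if any(term in user_message.lower() for term in BLOCKED_TERMS):
        return "Sorry, I can't help with that."

    reply = call_foundation_model(SYSTEM_PROMPT, user_message)

    # Output-side guardrail: filter the model's reply as well.
    if any(term in reply.lower() for term in BLOCKED_TERMS):
        return "Sorry, I can't help with that."
    return reply

# Everything above lives in the wrapper application. If the provider silently
# re-tunes or swaps the underlying model, the prompt and filters stay exactly
# the same, but the behaviour they were tested against may not.
```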

Are we expected to believe that education systems are going to be able to maintain and update their safety features and guardrails on custom-built education chatbots when OpenAI and Google can’t even maintain their own?

The bigger issue, however, is that this kind of safeguarding distracts us from the fact that these technologies are inherently unsafe. We shouldn’t be lulled into a false sense of security with any kind of AI system, including generative AI. Instead of trying to guard against every possible negative outcome, we need to decide when “safe enough” is safe enough, and educate students (and teachers) about the real potential harms.

When is safe enough, safe enough?

Let’s come back to the alcohol analogy and extend it to other risky activities like driving and drug use. We could perhaps place these on a continuum, with class-A drug abuse at one end and driving at the other. Obviously drug abuse is dangerous, but so is driving, particularly for young people. So we have age limits, legal requirements (such as minimum driving hours and theory and practical tests), and driver’s licences, which can be revoked for committing crimes behind the wheel. But on the whole, as a society, we accept that driving is safer than taking drugs.

Somewhere along this continuum, we’re going to have to decide where generative AI, and artificial intelligence more broadly, fits. Is it, for example, less harmful than getting behind the wheel of a car? Is there less risk in a 13-year-old using an AI platform than in that same 13-year-old drinking a glass of wine? Those decisions are partly the job of regulation and partly the job of society, through the norms we set around these technologies. But right now, it’s far too early to fully understand where on the continuum generative AI sits.

GenAI and Social Media

A more obvious comparison is to set generative artificial intelligence alongside social media. It took roughly a decade for society to catch up with the harms presented by social media platforms like Facebook: cyberbullying, the potential for spreading bias and misinformation, and even the capacity of these platforms to sway political processes. Regulation of social media was laughably slow to arrive, and in some countries is still non-existent. And in the meantime, the owners of these companies have become fabulously, ludicrously rich.

Many of these social media executives, funders, venture capitalists, and CXOs have taken those earnings and gone on to become prominent figures in artificial intelligence. Zuckerberg shifted Meta’s focus away from Facebook, first into VR and then into artificial intelligence. Sam Altman, CEO of OpenAI, founded his own social media platform early in his career (the failed ‘Loopt’), then went on to become a partner at Y Combinator, the ‘startup accelerator’ which has invested heavily in social media and related technologies. Elon Musk, one of the original founders of OpenAI, has gone in the other direction and taken over one of the world’s most popular social media platforms.

Structurally, and financially, GenAI has deep roots in social media.

We were slow to realise that social media is inherently unsafe. Slow to realise that it’s not users’ fault for becoming addicted to these platforms. Slow to realise that it’s not the user’s responsibility to safeguard their private data or to read pages and pages of Terms and Conditions. And so we let social media companies self-regulate, and only realised much later that they weren’t particularly interested in minimising those harms.

The Alternative

What is the alternative? I don’t think it’s to ban or block these technologies. And given their growing ubiquity, it’s highly impractical to even attempt to stop students from using them. Without wanting to sound too deterministic, it’s almost inevitable at this point that most texts created in a digital context and many interactions with computers in the future will be underpinned by some form of generative AI.

The only alternative, of course, is education. Deciding when “safe enough” is safe enough means coming to faster conclusions about the risks and social norms surrounding generative AI. Everyone involved in education – schools, universities, parents, staff, and the broader community – needs to be aware of the potential risks and to educate students accordingly. Yes, we need measures in place to mitigate those risks, like the safety work behind NSWEduChat. But we should never claim that these efforts make the technology “safe”.

Students need to be aware that these technologies are biased, are capable of producing misleading and discriminatory content, and can be used in harmful ways. Even if a chatbot went completely rogue and started swearing at its users, chances are most kids will have heard worse in the playground by the time they hit mid-primary school. But the racist, discriminatory, and sexist comments, or any other form of toxic content these machines are prone to when the guardrails fail, could be more distressing – particularly for students from minority groups.

The real harms of generative AI are insidious, and not as immediately apparent as the physiological or mental effects of alcohol, or the risks of driving. Complex ethical concerns underpin the technology, from development to the very infrastructure that generative AI is built on. It’s not just about whether ChatGPT produces sexist content, but rather the fact that the datasets these models are constructed on are inherently sexist. That is the problem. Going back a step, you could even argue that the bigger issue is the demographics of Internet users and the preponderance of discriminatory content online.

Luckily for educators, this is nothing new. Many schools have been running programs for years on cybersafety, digital consent, appropriate internet use, cyberbullying, and how to report criminal activity online. We can also put in place clear and transparent audits of these technologies, using tools like the Education Futures Studio possibilities matrix, which suggests alternatives to ubiquitous and potentially harmful technologies. Instead of pretending artificial intelligence can be made safe, we need to react quickly to these technologies and put in place measures to help students understand exactly why they are inherently unsafe… and how to use them anyway.

I’ve collated all of the resources on the AI Assessment Scale into a free eBook: click here to join the mailing list and get a copy.

If you have questions or comments, or you’d like to get in touch to discuss GenAI consulting and professional development, use the form below:
