AIAS: Why We’ve Driven Through the Traffic Lights

Author’s Note: There are four authors of the original and updated AI Assessment Scale: lead author Assoc. Prof. Mike Perkins of BUV, me, Dr Jasper Roe of JCU Singapore, and Assoc. Prof. Jason MacVaugh of BUV.

In the series of blog posts which will follow, I am expressing my own thoughts on the AIAS: though the four of us align on the AIAS, I cannot speak for the other authors. For the reflections of all the authors, please read the published and forthcoming academic papers, or the other authors’ own publications and posts.

Last week, I made a post about our updates to the AI Assessment Scale. These updates are the culmination of using the scale in K-12 and higher education for just over 12 months.

We were also conscious that many schools and universities have already adopted the scale, and we wanted to keep our new version close to the previous so that it did not create a burden for educators choosing to transition from one to the other.

But there is, of course, one immediately obvious change which has already stirred up the most commentary: the move away from the red-to-green, “traffic light” colour scheme.

In this post, I’m going to explain a little more of why we have chosen to replace the traffic lights with what I’ve been informally calling the “bubble gum colors” (This term might cause my co-authors some amount of horror, but I’ve said it now so you can’t unsee it).

From Traffic Lights to Bubble Gum

“Version zero” of the AI Assessment Scale had neither clear colours nor the recognisable tabular structure of the published version. In fact, it was a very off-the-cuff affair; a five-point Canva template with a few emojis to indicate tasks at various levels. It was a stop-gap attempt to address the immediate issue of the use/don’t use (or ban and block) approaches to AI in education in early 2023.

But the published and incredibly popular version, the version that most readers will recognise, used the red to green traffic light colours from the start. Similar traffic light models have emerged over the past 12 months, some adaptations of our AI Assessment Scale, others developing in parallel but not necessarily influenced by our scale.

The connotations of the traffic light colours, however, are clear in every instance of their use. Red means stop, and green means go. Or, in the context of artificial intelligence and assessment, red means “do not use” and green means “use freely.” In the middle, traffic light systems have an amber level, meaning variously “use with caution,” “use with discretion,” “use critically,” or “some uses but not others are permissible.”

In our original AI Assessment Scale, we felt the need to break down the yellow level into three distinct categories, allowing for idea generation, editing, and more collaboration with AI systems. We still believe that it is necessary to differentiate between the ways in which educators and students might use artificial intelligence in a given assessment, and I’ll return to that later.

But there are also some unfortunate consequences of the traffic light system that we felt necessary to address.

Breaking the Hierarchy

One problem with the traffic light system is that it visually represents a hierarchy, moving from red through amber, yellow, and yellow-green to green, indicating, perhaps to some, a preferential order or taxonomy of skills.

The suggestion in this reading of the AI Assessment Scale is that using AI for idea generation is better than no AI, that using AI for everything up until the editing is better than just generating ideas, and so on. It also perhaps suggests a taxonomy of the skills outlined in each level, maybe implying that idea generation or brainstorming is a lower-order skill than editing and evaluating, or that critical and creative thinking (the formal focus of level four) are more important skills than note-taking and organisation (down in level two).

I’m using a lot of hedging language here, lots of “perhaps” and “maybes” and “suggesting,” because these are all interpretations of the AI Assessment Scale, and not necessarily our intent as authors. Both as a student and a teacher of literature, I understand that the moment you write something and put it out into the world, the reader’s response is as important as, if not more important than, the author’s intent.

By putting the AIAS out into the world as a traffic light system, we made a design choice. It was a good design choice, since the scale is striking and memorable and the colours are easy to associate with the different levels, but it also opened the scale up to interpretations of a hierarchy which we don’t believe exists.

Red Means Stop!

The other aspect of the traffic lights we wish to move away from is the suggestion that the AI Assessment Scale can be used to stop or limit the use of AI. We recognised very early that permitting any AI use essentially permits all AI use, and we never encouraged schools or universities to treat the AI assessment scale as a tool for stopping, catching, or detecting the use of AI.

We tried to make that clear in the original article, where we discussed that the AI Assessment Scale is intended to be used at the beginning of an assessment process: a discussion with students over whether the use of AI is appropriate for a given task. And when we say appropriate, we mean: is it a good use of the technology, or a good use of your brain, to use AI in this situation?

Again, once you put something out into the world, it can be interpreted in many ways. Some chose to adapt the AI Assessment Scale and try to use it to benchmark the amount of AI use (0% at No AI, 20% at level two, and so on). But the reality is, if you permit students to use AI for brainstorming but ask them to refrain from using it in the final editing of their work, there is no way to prove that students have stuck to those rules. It is essentially nothing more than an honour code (and though honour codes are useful, they of course don’t stop all students from cheating).

In the new version, we want to make it clear that “No AI” does not equal a glaring red stoplight. “No AI” should be a decision based on what learning needs to be assessed at that moment in time. It should be clearly articulated to students that the reason this given task excludes AI is because it is a bad use of the technology or a bad use of their brain…

For the other levels, just like we are avoiding a taxonomy of skills, we also want to avoid the idea that a student will automatically stop using AI at a given point, because we know that assessment processes are complex and iterative. In one of my earlier books, “Practical Writing Strategies,” co-authored with Benjamin White, we make very clear that the writing process, for example, is a looping, iterative process.

The AI Assessment Scale is not and never was a tool designed to stop or limit the use of AI. It is a scale designed to help educators articulate and teach the appropriate use of AI. Hence, at level two, we design assessment tasks where students are permitted to use AI but have also been explicitly taught how to use artificial intelligence as part of the initial design and research stages of their work.

Over time, I will write posts giving extensive examples of what this might look like across disciplines. But for now, here’s a brief idea:

Research tools like Elicit already use artificial intelligence to search for academic articles based on research questions and to synthesise abstracts. These systems are far from perfect, but they offer an interesting alternative or addition to a traditional library search. Whilst traditional library searches require an understanding of complex search terms and Boolean operators (AND, NOT, OR), these AI-powered research tools use semantic understanding and can search based on natural-language research questions.

In a first-year university course, you might want students to rely on “old fashioned” research methods and discourage the reliance on AI-powered tools. But by the second or third year of a university course, I would certainly expect students to be experimenting with and learning the strengths and limitations of these AI library tools. In order for students to do that and do it well, they need instruction. And in order to appropriately instruct students on how to use those tools, educators must first be given “permission” to use AI in the classroom.

This is what the AI Assessment Scale allows for. I’d even go as far as to say that level one, which excludes the use of AI, is still a permissive level. Level one gives educators permission to say, “We don’t have to pour that AI special sauce all over everything. Some things are best taught without AI. Some things are best taught without any technology at all.”

We give the students permission to explore their ideas to the fullest without the distractions of technology. We give permission for long, thoughtful discussions and interactions and engagement, arguments and debate. We give permission for the broadly acknowledged importance of sometimes picking up a pen and writing thoughts by hand as opposed to hammering them out via a keyboard or prompting for them via a chatbot.

I’d encourage anyone using the AI Assessment Scale to consider what each level opens up as opportunities for students to explore technologies or methods of demonstrating their learning.

Hubba Bubba

The last question, then, having decided to move away from the traffic lights, is what we should replace them with. We had a few options here.

We could, of course, go with a plain-text, black-and-white table, but that doesn’t offer much distinction between the levels, and as a design choice it does nothing to make the AI Assessment Scale memorable or visually appealing.

We considered a gradient, but if the contrast between levels is not steep enough, a gradient is less accessible to partially sighted readers. (The gradients between the yellow levels in the first version of the scale, for example, were not steep enough, and proved problematic for some readers, who found that levels two, three, and four blended together.)

Balancing the non-hierarchical nature of the updated scale with the need for visibly distinct levels means choosing clearly alternating colours. And moving away from the traffic lights means choosing colours with less immediately apparent symbolic connotations.

Enter the bubble gum pastels.

Co-authors Mike Perkins and Jasper Roe, and their colleagues, have already experimented with a pastel version of the AI Assessment Scale in a recent publication on the AIAS for English for Academic Purposes. They write:

“We recognise that the broad range of tasks present in EAP assessments means that these identified scale levels might be used at different times and for different purposes throughout EAP education, and that no level of the scale is inherently ‘better’ or more suitable than another. We have therefore adjusted the original colour scheme presented in the AIAS from a Red-Green scale to a more neutral palette of colours.”

That pastel colour scheme was the first step away from the traffic lights and, as we later noticed, harkens back to the colours of the original emoji “version zero” of the AIAS.

We laboured over the exact colours to the point where I can confidently rattle off the hex codes for each of the levels. We tested the colours against online accessibility tools in Adobe, we used Claude to create an artifact to test the accessibility (you can try it out for yourself here), and we spoke to partially sighted and colourblind individuals and asked them if they could distinguish between the levels and whether the text was legible against the various colour backgrounds.

Colours and fonts galore from the Design Assets pack, which we have made freely available via this link.

For consistency, we chose colours where the text did not have to change from black to white, which meant that on the darker colours, we needed to be selective about the brightness and the hue.
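For readers curious about what that kind of checking involves, here is a minimal Python sketch of the WCAG 2.1 contrast-ratio calculation that accessibility tools use. The hex codes below are hypothetical pastel placeholders, not the actual AIAS palette:

```python
# WCAG 2.1 contrast checking: relative luminance and contrast ratio.
# The pastel hex codes below are hypothetical, not the real AIAS colours.

def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of an sRGB colour, per WCAG 2.1."""
    hex_colour = hex_colour.lstrip("#")
    channels = []
    for i in (0, 2, 4):
        c = int(hex_colour[i:i + 2], 16) / 255
        # Linearise the gamma-encoded sRGB channel value.
        channels.append(c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4)
    r, g, b = channels
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: str, bg: str) -> float:
    """Contrast ratio between two colours (always >= 1)."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)

# Black text on some placeholder pastel backgrounds; WCAG AA asks for
# at least 4.5:1 for body text.
for pastel in ("#F8C8DC", "#B5EAD7", "#C7CEEA"):
    ratio = contrast_ratio("#000000", pastel)
    print(f"black on {pastel}: {ratio:.1f}:1")
```

Because pastels are light, black text clears the 4.5:1 AA threshold comfortably on all of them, which is exactly why a fixed black-text palette is easier to keep accessible than one that flips between black and white.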

We are not any of us graphic designers. So we also enlisted the help of the Fiverr freelancer I have used in the past to compile my blog posts into free PDF ebooks. Even after that design work, we continued to play around with the colours (“play” is a strong word here for the hours of squinting and painstaking Canva work).


Goodbye, Traffic Lights

Honestly, I think this is the longest post I’ve ever written about choosing five colours. The content of the levels is probably more important from a pedagogical point of view than the design decisions. But we know that the first and most obvious change to the AIAS is the new colour scheme, and we know that people responded well to, but also sometimes misinterpreted, the traffic lights.

If you’ve read this far through my winding discourse about skill taxonomies, accessibility, and hex codes: Congratulations!

And a final request from me: The AIAS is used the world over. It pops up in surprising places, and sometimes the other authors and I are not involved in the conversations and contexts where it is used. That’s fine, but if you see criticisms of the scale based on some of the misinterpretations I’ve listed in this article, particularly around the hierarchy or suggested taxonomy of skills, point them to this post or, better yet, direct them to reach out to me or any of the authors with questions.

We will continue to work with and adapt the AI Assessment Scale, and we encourage others to do the same. Soon, we will release multiple versions of the scale for adaptation and transformation under its Creative Commons licence.

Thank you, as always, for all of the conversations, examples, criticisms, and discussions around the AIAS. They have all contributed to our updates and continue to make the AIAS flexible and robust in helping educators deal with the implications of generative AI.

If you’d like to get in touch to discuss GenAI Consulting or anything at all related to artificial intelligence in education, including the AIAS, use the form below:

