The AI Assessment Scale: Version 2

Back in April I published the first post on the AI Assessment Scale (AIAS). It emerged from a discussion with the teaching and learning team at Edith Cowan University. We felt that a binary yes/no approach to using Generative Artificial Intelligence (GenAI) would be unsustainable, and that we needed a more realistic way of integrating the technology into assessments.

A lot has happened since April, and recently I have worked with Dr Mike Perkins, Dr Jasper Roe, and Dr Jason MacVaugh at British University Vietnam on updating the AIAS. This post is a reflection on those changes, and on the more detailed account we have given in this article.

UPDATE: The AIAS has now been published in a peer reviewed journal, the Journal of University Teaching & Learning Practice (JUTLP). You can find the AIAS article in Vol. 21 Number 6.

Why do we need an AI Assessment Scale?

GenAI has caused a stir in education because it has revealed the fragility of many of our current assessments. The narrative of “cheating” that has accompanied the technology since the release of ChatGPT in November 2022 has been overpowering. But we believe that this narrative needs to shift, both to account for the advances in the technology, and to honour our students’ integrity and creativity.

We decide what is and what is not considered cheating. The AIAS is designed to make it easier to articulate those decisions, and to help educators communicate them to students. It is not sufficient to say “the use of ChatGPT is considered cheating”. It is also inaccurate to label the use of GenAI as plagiarism, as we have explained in the paper.

Detection tools do not work accurately enough for academic misconduct decisions, and so we cannot rely on post-assessment methods of reinforcing academic integrity.

The AIAS, then, comes before the assessment task and can be negotiated with students. It allows educators to say very clearly how and why AI can or cannot be used in a given task. You can use the Scale to help students understand that in one assessment it doesn’t matter if they use AI for idea generation or editing, while in another we want them to demonstrate their own knowledge and skills unassisted.

The original AIAS

Though the original Scale came from a discussion with tertiary staff, it was grounded in my experience as a secondary English teacher. When I first conceptualised the five levels of the Scale, it was centred on a typical assessment like an essay or analytical response. The five levels were articulated as follows:

  1. No AI: The assessment is completed under supervision, and/or handwritten, and/or under exam conditions. This level is suitable for testing knowledge and comprehension. For example, a traditional multiple-choice exam or an in-class essay written without the use of AI tools.
  2. Brainstorming and ideas: AI can be used in the initial stages of the assessment for brainstorming and idea generation, checking ideas, etc. This level is suitable for assessments where students need to demonstrate their writing skills, such as constructing their own essays. For instance, students might use AI tools to help generate ideas for a persuasive essay or research paper.
  3. Outlining and notes: AI can be used to outline entire responses or convert notes (or audio transcriptions) into organised ideas. This level is suitable for assessments where the focus is on the final quality of the writing, word choice, and expression. For example, students might use AI tools to turn their handwritten notes into a cohesive essay outline or restructure their notes to create a more logical flow of ideas.
  4. Feedback and editing: AI can be used to provide feedback, self-assessment, or editing and revision. This level is ideal when the assessment focuses on the quality of the ideas and understanding, independent of the quality of language and expression. Students might use AI tools to receive instant feedback on their draft essays or identify areas of improvement in their writing.
  5. Full AI: AI can be used to generate the entire output. This level is suitable when the outcome of the assessment is judged on the earlier organisation, idea generation, discussion, orals, and other methods. For example, students might use AI to create a comprehensive summary of a group discussion or synthesise research findings into a cohesive report.

I’ve taken this AIAS into many K-12 schools that I’ve worked with this year and have seen it adapted for many contexts beyond the English classroom. I’ve also discussed it at a tertiary level, which led to conversations with Mike, Jasper and Jason on how it might apply across other disciplines.

Updating the AIAS

Choosing to update the AIAS was really a response to the growing need for a clear and approachable means of addressing GenAI in education, both in K-12 and tertiary settings. We felt other models, such as the ‘traffic light’ approach, lacked the nuance needed to clearly articulate to students how AI can and cannot be used in certain tasks. After some discussion of the levels of the AIAS, we produced the revised Scale outlined below.

The revised AIAS is designed to be flexible enough to account for a variety of disciplines and assessment forms, as well as allowing for the multimodality of generative AI. This means it can be applied to the use of image generation and editing tools just as easily as text generation like ChatGPT. It can also apply to tasks involving the creation of novel texts (such as writing essays or creating artworks) as well as tasks involving research, summarisation, and synthesis which may (or may not) use internet-connected GenAI tools.

Using the AIAS

In the full paper, we clearly outline each level of the AIAS and the kinds of tasks it may be applied to. Here’s a condensed version which explains the five levels of the updated AIAS:

Level 1: No AI

  • Students can’t use GenAI.
  • Suitable for assessments needing personal skills/knowledge.
  • Activities: technology-free discussions, in-class work, viva-voce exams.
  • Recommended for supervised or low-stakes assessments due to equity concerns.

Level 2: AI-Assisted Idea Generation and Structuring

  • GenAI used for brainstorming and structuring ideas, but final work must be human-authored.
  • Useful for idea development and foreign language classes.
  • Activities: collaborative brainstorming, creating structured outlines, research assistance.

Level 3: AI-Assisted Editing

  • Students use GenAI for refining and editing their work.
  • Beneficial for language improvements and multimodal content.
  • Activities: correcting grammar/spelling, suggesting synonyms, structural edits, visual editing.
  • Students submit original work alongside AI-assisted content for authenticity.

Level 4: AI Task Completion, Human Evaluation

  • Students actively use GenAI for specific task components and critically evaluate AI outputs.
  • Encourages understanding of GenAI’s capabilities and limitations.
  • Activities: direct AI generation, comparative analysis, critical evaluation, integrating AI content.
  • Flexibility in AI and human intelligence interaction.

Level 5: Full AI

  • AI used throughout the task at student/teacher discretion.
  • Suitable for tasks where GenAI is integral to learning outcomes.
  • Activities: co-creation, GenAI exploration, real-time feedback loops, creating GenAI products.
  • Encourages exploring GenAI as a collaborative and creative tool.

We also provide supplementary material in the article which includes examples of the AI Assessment Scale applied to a variety of disciplines, from Business to Creative Industries, Computing, and Hospitality. In the supplementary material the five levels are broken down into easy-to-understand tasks with clear advice on how generative AI may or may not be used, and what students need to do in order to reference and acknowledge their use.


Introducing the AI Assessment Scale for Ethical GenAI Assessment

Here is the abstract from the paper, just released as a preprint:

Recent developments in Generative Artificial Intelligence (GenAI) have created a paradigm shift in multiple areas of society, and the use of these technologies is likely to become a defining feature of education in coming decades. GenAI offers transformative pedagogical opportunities, while simultaneously posing ethical and academic challenges. Against this backdrop, we outline a practical, simple, and sufficiently comprehensive tool to allow for the integration of GenAI tools into educational assessment: the AI Assessment Scale (AIAS). The AIAS empowers educators to select the appropriate level of GenAI usage in assessments based on the learning outcomes they seek to address. The AIAS offers greater clarity and transparency for students and educators, provides a fair and equitable policy tool for institutions to work with, and offers a nuanced approach which embraces the opportunities of GenAI while recognising that there are instances where such tools may not be pedagogically appropriate or necessary. By adopting a practical, flexible approach that can be implemented quickly, the AIAS can form a much-needed starting point to address the current uncertainty and anxiety regarding GenAI in education. As a secondary objective, we engage with the current literature and advocate for a refocused discourse on GenAI tools in education, one which foregrounds how technologies can help support and enhance teaching and learning, which contrasts with the current focus on GenAI as a facilitator of academic misconduct.

The preprint is available on arXiv.

As you can see, we have focused on the ethics of generative AI use in a broad sense, as well as the specific concerns of academic conduct and honesty. Our approach is intended to help educators approach generative AI with an open mind and in a way that encourages authentic, secure, and rigorous assessment across disciplines.

If you’d like to get in touch to discuss generative artificial intelligence, please contact me using the form below.
