Beyond Scales - In Practice

This is one half of a pair of articles dealing with the theory and the practice of moving “beyond scales” in AI and education. Together, the two posts show how our thinking has developed since 2023 and share examples from practitioners around the world whose work has gone beyond the original AI Assessment Scale (AIAS). The posts can be read in either order and the other half can be found here. This post deals with the practical examples.

Beneficial mutations

Dylan Wiliam’s concept of “lethal mutations” describes what happens when educational ideas lose their coherence as they scale: good principles, distorted by context, implementation pressure, and the sheer messiness of schools and universities. In the companion article I explore how the AI Assessment Scale has been subject to its share of lethal mutations over the past three years, why we couldn’t design away some of them if we tried, and why I suspect that whatever happens “beyond scales” may face the same fate.

Beyond Scales – In Theory

But not all mutations are lethal. A beneficial mutation is an adaptation that doesn’t just survive contact with reality but actually improves on what came before. Over the past few years I’ve had the privilege of seeing many institutions take the AIAS and reshape it into something more contextualised and demonstrably useful to the staff and student communities.

Examples from around the world

A quick online search surfaces many institutions experimenting with or sharing the AIAS and similar approaches, including the University of Iowa, TU Dublin, Oxford Brookes, Flinders University, La Trobe University, the University of Colorado Boulder, Northwest University, Saskatoon Public Schools, and many more. We don’t know how the scale or its variations have been adapted in most of those places, since we didn’t have a hand in their development. But we often hear directly from schools and universities sharing their practices through emails, articles, LinkedIn conversations, and professional connections.

What follows is a small sample of those adaptations.

Saskatoon Public Schools, Canada

In early 2025, Candace Elliott-Jensen, Coordinator of Secondary Learning Supports at Saskatoon Public Schools, began adapting the AIAS for her division. I’ve written a full case study on the Saskatoon adaptation elsewhere, so this is just a summary of the key decisions.

Her team’s most significant pedagogical decision was to move away from the scale’s hierarchical structure entirely – something we toyed with in version 2. Rather than presenting the categories as steps on a ladder, Saskatoon treats them as parallel options, each suited to different learning purposes. Their visual design deliberately mixes the icons rather than numbering them or arranging them sequentially.

Candace and her team worked directly with students, and they found that their framework helped move conversations away from the binary question “is AI allowed or not?” towards purpose, transparency, and student authorship. Saskatoon is now developing an elementary version with the same icons but simpler language, aiming for consistency across their K-12 system.

University of Queensland, Australia

Dr Olivia Wright, Director of Teaching and Learning at the School of Human Movement and Nutrition Sciences at the University of Queensland, has taken a different path with the scale. Dr Wright is leading the AI Learning Initiative via Engagement (ALIVE) project, which focuses on developing and evaluating cross-disciplinary, agent-guided assessment experiences. Rather than using the AIAS as a set of rules to enforce, her team has embedded it into their assessment design approach as a framework to guide staff thinking and shape how AI-mediated learning experiences are structured.

In practice, this means students engage with AI through guided, real-time interactions during assessment tasks, with AI agents designed to scaffold critical thinking, ethical decision-making, and reflection rather than generate answers. One aspect of particular interest is how Dr Wright’s team drew on the custom GPT tool to help design the prompts that shape student-AI interactions. They used it to generate structured prompt sets that underpin their Copilot agents, aligning agent behaviour with different levels of intention on the scale while creating a consistent, shareable approach across disciplines. In Dr Wright’s words, the team is “shifting toward a learning-first approach, where AI use is made visible and assessable through the interaction process itself.”

Early engagement from staff has been very positive, and the scale played a key role in building confidence and supporting more nuanced conversations about assessment design and student capability.

Medical University of South Carolina, United States

Dr Julaine Fowlin, Executive Director of the Center for the Advancement of Teaching and Learning at MUSC, led an adaptation that emerged from an AAC&U (Association of American Colleges and Universities) AI institute project. Like Saskatoon, her team deliberately moved away from numbering, recognising that a numerical scale implied hierarchy. They completely redesigned the framework as a cycle, renaming the categories to better fit their institutional context. These categories emerged from faculty feedback, and MUSC also adopted “task” instead of “assessment” terminology to recognise that AI use guidelines apply to a range of academic work beyond formal evaluations.

The adaptation was approved through three institutional councils and involved eleven named contributors across multiple colleges. Since the initial 2025 pilot with the College of Health Professions, the work has grown: MUSC created a dedicated Director of AI in Education and Scholarship role, and the team is now testing the framework’s utility with PhD biomedical and basic science graduate students. Dr Fowlin has also mentored Johns Hopkins in adopting their AI framework, extending the reach of MUSC’s adaptation beyond a single institution.

Berkeley College, United States

Jason Gulya, Professor of English and Communications at Berkeley College, has been writing publicly about AI and assessment since early 2024 through his Substack newsletter, The AI Edventure. His adaptation of the AIAS is notable because it turns the scale from a faculty-facing prescription into a student-facing reflective tool, embedded within a broader rethinking of how assessment itself works.

Gulya’s starting point was dissatisfaction with his own practice. In mid-2024, he described looking at his traditional essay assignments and feeling like a fraud; he had been writing about AI and education for two years but hadn’t actually reimagined his own assessments. That honesty led him to replace final papers with what he calls “AI-Powered Portfolios,” structured sequences where students freewrite, close-read, engage with a custom contrarian chatbot he designed, draft a short position statement, and then reflect on their process.

Students break their writing process into steps, self-report an AIAS level for each step, and defend those choices. Why was level-three use appropriate for getting feedback on a rough draft? What would have been lost by going to level four? The defending matters as much as the disclosing: Gulya asks students to consider whether each use of AI empowered their voice or gave it away. As he wrote in Faculty Focus, “the real power of the AIAS is as a student-facing tool. We can ask students to give the different parts of their process a rating, and then walk through why they gave them that rating.”

Over the following year, Gulya took the model further. He developed what he calls SEWPs, or Self-Empowering Writing Processes, in which students design their own multi-step processes, implement them, and reflect on them. AI use is permitted, but students must disclose it and defend it. He also moved towards alternative grading, piloting a hybrid of competency-based and specifications grading where students earn “Completes” against core skills rather than percentage grades. The connection between these is deliberate: Gulya argues that the transactional model of education, where students trade products for grades and grades for degrees, is one of the biggest obstacles to adapting to AI. If students are optimising for efficiency, they will offload thinking to AI uncritically. Changing the incentive structure is, in his view, a precondition for the transparency work to mean anything.

Gulya has also been candid about the limits of process-focused pedagogy. In a 2025 newsletter, he warned against what he sees as its four common pitfalls, the first being that “process over product” is, in a sense, a lie: you can never access a student’s process directly, only the products they create along the way. The danger is that a process-oriented course simply asks students to submit six products where they used to submit one, turning process pedagogy into surveillance. His answer is to design those intermediate products carefully, to allow for emergence and revision, and to avoid policing.

Holland Hall School (K-12), United States

Jane Beckwith, Director of Teaching and Learning at Holland Hall in Tulsa, Oklahoma, developed the AI Friction Scale after attending the National Humanities Center’s AI and Digital Literacy Teachers Institute at the University of Tulsa. The adaptation retains the AIAS’s five levels and descriptors but overlays them with the concept of “friction”, a productive form of cognitive resistance.

The visual design replaces the linear scale with a dial, reinforcing the idea that these are not steps to climb but settings to calibrate. More significantly, Beckwith expanded each level into a full student-facing resource with four sections: “Why?” (explaining the pedagogical rationale), “How to know that you are ready” (readiness indicators, presented as checkboxes), “What to watch out for” (common pitfalls at that level), and “What to aim for” (aspirational goals for students working at that friction point).

The original scale told students what they could do with AI at each level but not why the constraint existed, or how to recognise whether they were genuinely learning from it. Beckwith’s addition of “unproductive friction” as a concept, the kind of struggle that signals a student needs support rather than more persistence, is a particularly careful touch; it avoids the trap of treating all difficulty as inherently valuable. At level one, for instance, the scale warns that struggling to initiate or structure a task independently “is a sign of unproductive friction and should prompt a conversation with your instructor.” At level three, it cautions against using AI to mask a lack of fundamental understanding. At level five, it warns against getting lost in endless experimentation without a clear objective.

What we’ve learned

As I’ve said, I think version three already exists in the hands of any schools or universities that have taken version two and run with it. Some of these adaptations are better versions of the scale; others are beginning to outgrow it, asking questions about agency, transparency, and friction that the original framework wasn’t built for. Either way, these will be the things worth keeping.

First, that scales are insufficient to capture the complexity of AI, but AI is so complex that educators and students need ways to contain it.

Saskatoon’s decision to treat the categories as parallel options rather than a progression shows them dancing with that complexity. How do you acknowledge that the many and varied potential applications of AI don’t match up with the ways that students and teachers are using it in the real world? Yes, theoretically, ambient AI is just around the corner. Theoretically, everybody will soon have a personal agent whispering sweet nothings in their ears. Theoretically, AI agents will soon become so complex that it’s a moot point whether they’re conscious or not, because it might feel like we’re working with real human individuals.

But right now, students are using AI for editing, proofreading, and summarising. Right now, teachers teach subjects where their instruction explicitly intersects with those skills. Right now, teachers are struggling with the fact that students are using AI poorly to offload those important areas. The scale, in Saskatoon’s hands, became a framework for having those grounded conversations.

Second, that the question of when and where AI fits into a curriculum is contextual and discipline-specific, but also requires consistency within an institution and a sector.

The work in assessment design at UQ shows educators coming to terms with this issue, customising an approach in specific disciplines to use the metalanguage and sequencing of an existing curriculum. Not the mundane question “what is a level two task?” but the infinitely more complex “how might a student use artificial intelligence as part of brainstorming, in this subject, on this topic, at this moment in time?” MUSC’s addition of documentation requirements across the framework is similarly contextual; what you ask a medical student to reflect on as they use AI is not the same as what you’d ask an engineering student or an arts student.

Third, that the most productive adaptations don’t just reposition the framework toward students; they reframe what it’s asking them to think about.

Gulya’s transparency statements turn the AIAS from a rule about what’s allowed into a reflective prompt: students describe their AI use at each stage of their process, then defend whether that use empowered their voice or diminished it. The question shifts from compliance to agency, and from “did you follow the rules” to “did you learn anything.” Beckwith goes further, replacing permission levels with cognitive friction and asking students to calibrate the productive resistance they need at each stage of a task. Her methods give students a way to self-assess whether they have the foundational understanding to benefit from AI at a given level, rather than simply being told they may or may not use it, and her concept of unproductive friction acknowledges that not all difficulty is worth persisting through.

And fourth, that no single framework can capture the complexities of these conversations.

No framework can adequately tick all of the boxes across subject areas, year levels, sectors, or international borders. It almost goes without saying, but I’ll say it anyway, that trying to copy and paste a framework from one jurisdiction into another is invariably going to lead to lethal mutations. But we have learned that it is possible to take a framework like the AIAS and, through professional conversations and teacher expertise, hone it into something contextualised and relevant. A lot of these “things we’ve learned” are obvious in hindsight, but I think it would be disingenuous to suggest that most people could predict in 2023 what this would look like in 2026.

The AIAS at this point is like the Ship of Theseus, or the broom that has been rebuilt from handle to bristles. And that is exactly how it should be.

What’s Next?

In the companion article, I described one version of “what’s next” from the position of the Australian higher education sector. But a broad view across sectors and jurisdictions is also useful to make sure we don’t congeal around obvious-seeming answers like scales and lanes. Every framework and model from the past three years has contributed to the broader global discourse of AI in education, and we have much to learn from one another.

These adaptations are not the end of the story. They are, perhaps, a snapshot of a moment that is already passing. The technology that these frameworks were designed to address has moved on since each of them was written, and it will move on again before the ink dries on this article. We need to consider what happens when that movement accelerates: when wearable AI erodes the “separability and observability” that secure assessment currently depends on, and when the frameworks we build today face the same pressures that reshaped the AIAS over the past three years.

But that is the theoretical problem. The practical one, the one these educators have been solving in real time, is more modest and more urgent: how do we help the students in front of us, this semester, in this subject, use (or refuse) these technologies thoughtfully and well?

The examples above highlight that the answer is unlikely to come from a single framework. It will come from the educators who take whatever framework is to hand, understand its principles well enough to know which parts to keep and which to let go, and build something that fits the students and the context they actually have. That work is already happening. It has been happening for three years. And it will continue to happen all over the world.

If you’ve got examples to share or thoughts to add to this conversation, then please get in touch.

Dr Leon Furze