Rethinking Assessment for Generative AI: Beyond the Essay

This post is part of a series on rethinking assessment in light of generative AI. The posts draw on research and resources from K-12 and tertiary to suggest ways that educators can design engaging, compelling assessments which shift the narrative away from GAI and “cheating”. Click here for the previous post on oral assessments.

There are plenty of good reasons to use the essay as a formal piece of assessment. Essays offer an opportunity for students to demonstrate their knowledge in a cohesive, structured manner, forming logical arguments and leading the reader through their thoughts. They can also be creative and playful, allowing a confident writer to express their unique authorial voice.

Essays are also relatively easy to grade, and scalable. It’s straightforward enough to collect and mark essays written under examination conditions, even from thousands of students. Every year, the Victorian Curriculum and Assessment Authority (VCAA) collects around 45,000 English essays which are scanned into Pearson’s eMark platform and distributed to a few hundred assessors. It’s a huge task, but not unreasonable, and the essays are accurately graded in a two-week period.

Unfortunately, the expediency of essays as a form of assessment has made them a “go to” in many subject areas and at various levels from primary through to tertiary. Don’t get me wrong, I personally love writing. I’d take an essay over a multiple choice exam any day, and I’d probably prefer to write than undertake an oral exam. But the problem is, not everyone enjoys writing, and not everyone can write an extended response.

That would be fine, if essays weren’t so prominent. And now that an easily accessed technology exists that can do most of the work of essay writing, we have an even more complex problem. Because now if students don’t want to write, can’t find the time to write, or simply can’t write, they can turn to a chatbot like ChatGPT to do the work for them, and that’s obviously not what we want.

The answer, however, doesn’t lie in banning the technology or locking down all essays to examination conditions. Exams are terrible for accessibility, cause anxiety, and are far from a genuine representation of most skills. Instead, we need to look for ways to move beyond the essay.

In this post I’m going to explore a few alternatives which still tick the boxes of what could be assessed via an essay. I’m not discussing oral assessments here as I covered that in detail in an earlier post, but those are of course worth bearing in mind too when looking for alternatives.

Beyond the essay

Aside from the “threat” of GAI chatbots, here are a few more reasons why you might want to look beyond an essay or long written response as a primary form of assessment:

  • Limited Scope: Essays often focus on individual performance and intellectual capability, which may overlook other important skills such as teamwork, verbal communication, or practical skills.
  • Time-Consuming: Writing, revising, and grading essays can be time-consuming for both students and educators.
  • Writing Skill Bias: The format may unfairly favour individuals with strong writing skills, even if the subject matter does not primarily concern writing ability.
  • Plagiarism Risk: Essays can be susceptible to plagiarism, which undermines the learning process and the integrity of the assessment.
  • Stress Inducing: Some individuals might find essay writing to be stressful or anxiety-inducing, especially if they struggle with articulating their thoughts in writing.
  • Fixed Interpretations: Essays often require adherence to a particular thesis or argument, which may discourage the exploration of alternative perspectives or creative thinking.
  • Cultural Bias: The conventional essay format may reflect a particular cultural or academic tradition that might not be inclusive of or accessible to all learners.
  • Subjectivity in Grading: Grading essays can be subjective, and variations in grading standards can affect fairness and consistency in assessment.
  • Limited Feedback: In large classes, providing detailed, constructive feedback on essays can be challenging, which may hinder the learning process.
  • Misalignment with Learning Objectives: If the primary learning objectives are to assess practical skills, collaborative abilities, or other non-writing related competencies, essays may not be the best assessment tool.

The alternatives

So let’s take a look at a few alternatives which directly address those concerns, and which might also mitigate the risk of students breaching academic integrity by using GAI in unacceptable ways. For each of the assessment types I’ll provide an overview and then some examples. Since I haven’t taught in every single subject area, some of the examples will be generated by GPT-4.

Performance based

If you’ve ever taught in the VET sector (Australia’s vocational education and training system), you’ll find many of the assessment types in this post familiar. That’s because performance based tasks, observations, and on-the-job skills are par for the course in vocational education. But there’s no reason this can’t transfer to any secondary or tertiary subject.

After all, any discipline should be preparing students to use the skills and knowledge in some real-world context, whether that’s further study, employment, or another field entirely. Even esoteric subjects like philosophy, or subjects in the creative arts where the “goal” isn’t necessarily tied to economic imperatives, require students to develop skills they’ll actually use in the future. For core subjects in secondary – Maths, English, Science, and the Humanities – the skills and knowledge are far better related to real-world applications than to abstract chunks of knowledge.

Performance based assessments may require students to respond to essential questions and demonstrate skills in a real-world scenario. They allow for interdisciplinary knowledge and don’t necessarily rely on the content taught in a given unit or topic.

Here are a few examples:

  1. Mathematics: Students could be tasked with designing a budget plan for a small startup, applying mathematical principles to allocate resources, project profits, and manage expenses. The final submission could include a detailed report and a presentation to a mock panel of investors.
  2. English: Students might be asked to create a multimedia storytelling project where they write and illustrate a short story, then present it to a younger age group at a local library or school. This task encourages creative writing, visual storytelling, and public speaking skills.
  3. Physical Education: Students could create a fitness programme for a specific goal, such as preparing for a 5k run or improving general health. They could track their progress, reflect on their experiences, and present their results and learnings to the class.
  4. Economics: Students could be assigned to analyse the economic impact of a recent local or global event, using economic theories and models. They could present their findings in a video essay format to be shared with the community, encouraging real-world application and public discourse.
  5. Italian (LOTE): Students could be tasked with planning and executing an “Italian Culture Day” event, where they prepare Italian dishes, present on various aspects of Italian culture, and engage in conversations in Italian. This task encourages language use in a practical, engaging, and collaborative context.

Portfolio or writing journal

I love writing journal tasks. They’re my recommended form of assessment for VCE English and EAL Unit 2 and 3: Crafting and Creating Texts, and pretty much any form of creative assessment where students are required to demonstrate skill development over time. Portfolios of artwork, design ideas, and creative writing are common, but this assessment approach can be applied to other disciplines and subjects.

When I go through my own notes for my PhD colloquium document, it looks very much like a writing journal. I have annotations, extended abstracts, short snippets of writing which may or may not make it to the final piece, and draft versions of the document itself. Each piece contributes to the whole of my knowledge on the subject. The problem is, in secondary and tertiary education, we often don’t value the whole journey.

Obviously my situation is different: if you’re studying at this level, you have to really want to. It’s not like secondary English, where you’re doing it because it’s compulsory, or even undergraduate studies where you might be doing it just to get a job or because it’s expected. But if both students and educators at every level could learn to value the process of writing and creating, then we would probably find there’s a lot more valid content to assess than just the finished product. The added bonus is you’ll get a lot more insight into the student’s usual style, voice, and way of thinking, which can be useful in academic integrity conversations.

Here are a few examples outside of the usual subject areas of visual arts and English:

  1. Physics: Research Journal: Students could maintain a journal documenting their investigations into various physics phenomena. They could conduct small experiments, record observations, analyse data, and draw conclusions over the course of the term or semester. Additionally, they might reflect on how their understanding evolves with each experiment and how the concepts relate to real-world applications.
  2. Business Management: Case Study Portfolio: Students could create a portfolio of case studies analysing different businesses or management scenarios. For each case, they could provide an overview, identify challenges, propose solutions based on management theories, and reflect on the potential outcomes and lessons learned. This portfolio could showcase their analytical, problem-solving, and strategic thinking skills.
  3. Geography: Field Study Journal: Students could document field studies investigating local geographical issues or features. They could record observations, collect data, and analyse findings in a journal, reflecting on the implications and how the local findings connect to broader geographical concepts.
  4. Psychology: Observational Journal: Students could maintain a journal where they observe and analyse human behaviour in various settings, relating their observations to psychological theories and concepts. They might also reflect on how these observations alter or deepen their understanding of psychological principles.
  5. History: Historical Investigation Portfolio: Students could conduct investigations into different historical events or figures, documenting their research process, sources, analyses, and reflections in a portfolio. They might also include essays or reports that synthesise their findings, showcasing their ability to engage with historical inquiry and analysis.

Remember, with this type of assessment there’s no need to have a “finished product” at the end. Students can of course take parts of their work over the term and write a final essay, but they don’t have to.

Project based

Project-based learning (PBL) is nothing new. Some schools and education institutions have entire curricula based around PBL, dedicated middle-years programs, or whole sites devoted to the format. You don’t have to go all-in on PBL though to get some of the benefits of project based assessment.

In a typical project based assessment, there is a real-world problem and a structure like a design thinking process. There may also be an inquiry problem or research topic, and the project extends over a number of weeks or even a whole term or semester. At the end, there is often an opportunity to present or pitch an idea or solution to the problem.

As a means to rethink assessment in light of generative AI, project based tasks could be a great option as they are engaging, authentic, and allow a student to demonstrate their skills in a broad range of tasks rather than a pass/fail scenario. Students “cheat” in assessments for all kinds of reasons, but making tasks more engaging can mitigate some of the risk.

Here are some examples:

  1. Computer Science: Students could be tasked with developing a mobile or web application to solve a real-world problem. They would need to go through the stages of planning, design, coding, testing, and deployment, and finally present their application and a report of their process, challenges faced, and how they overcame them.
  2. Literature: Students could create a literary magazine featuring original short stories, poems, and essays, along with literary analysis of classic or contemporary works. They would be responsible for the curation, editing, design, and publication of the magazine, either in print or digitally.
  3. Music: Students could be tasked with composing an original musical piece based on a particular theme or historical period. They would then perform the piece, either solo or as part of a group, and submit a reflection on their creative process and the techniques used in their composition.
  4. Product Design and Technology: Students could identify a common problem and design a product to address it. They would then create a prototype, document the design process, gather feedback, make improvements, and present their final design along with a reflection on the iterative design process.
  5. Environmental Science: Students could conduct a study on a local environmental issue, such as pollution, wildlife habitat destruction, or energy consumption. They would collect and analyse data, propose solutions, and present their findings to the community or a local governmental body.

Image: Study of a local environmental issue, illustration, isometric. Model: Midjourney

Observations

As I mentioned earlier, the vocational sector has a lot to offer when it comes to real-world, non-essay based assessment tasks. Many of these kinds of tasks are “GAI-proof” because they happen away from devices under practical circumstances.

Observations can be conducted in a range of scenarios, including performance tasks, as part of longer projects, and during group work. The difference is the student likely knows that the observed period of time is their assessment, and is (hopefully) aware of the explicit criteria.

That can add some pressure to the task, but only the kind of pressure that the student is likely to face in a real-world scenario, and not the false pressure of an examination. 

In VET subjects, an assessment tool is a framework for evaluating students’ knowledge and skills, comprising assessment context, tasks, evidence gathering guidelines, performance quality criteria, and administrative requirements. These tools, guided by principles of validity, reliability, flexibility, and fairness, ensure that assessments are accurate, consistent, negotiable, and equitable.

The design of these tools necessitates industry consultation and testing on a student sample to ensure the evidence collected is valid, sufficient, current, and authentic, aligning with the competency units’ criteria.

This includes the design of assessment tools for observations. Tools like observation checklists, accompanying questions, and instructions for both students and lecturers/observers support this method. Observation checklists focus on vocational and employability skills, and adherence to workplace procedures.

Here are some examples of observation based tasks across different disciplines:

  1. Mathematics:
    • Task: Solving a series of progressively complex algebraic equations.
    • Checklist: Correct application of algebraic rules, accurate simplification, correct answer, and clear presentation of solution steps.
  2. Science (Physics):
    • Task: Conducting a physics experiment to measure the acceleration due to gravity.
    • Checklist: Correct setup of equipment, accurate measurement collection, proper calculation of acceleration, and thorough documentation of the process and results.
  3. Health and Physical Education:
    • Task: Demonstrating a series of gymnastic routines.
    • Checklist: Correct form and technique, smooth transitions between movements, adherence to safety guidelines, and completion of all required routines.
  4. Product Design and Technology (Food):
    • Task: Preparing a three-course meal adhering to nutritional guidelines.
    • Checklist: Proper hygiene practices, correct measuring and mixing, adherence to recipe instructions, presentation of the final dishes, and nutritional balance.
  5. Digital Technology (Computer Programming):
    • Task: Coding a simple game using a programming language like Python or Java.
    • Checklist: Correct syntax, efficient code structure, functionality of the game, debugging and troubleshooting skills, and user interface design.

Visual essays

The final type is an essay… of a sort. The University of Hertfordshire calls a visual essay “a critical commentary”, which I think is a perfect description of many kinds of essay, including traditional written ones. We want students to be able to give a critical, personal, and insightful commentary on their topic, whatever the form. 

A visual essay is a curated series of images, either original or significantly processed (including using GAI), that together provide critical commentary on a specific topic, functioning as a form of argument or discussion. The sequence and layout of images, accompanied by captions or integrated text, guide the ‘reading’ of the essay. Despite seeming less demanding, creating a visual essay requires effort comparable to traditional academic writing. 

A visual essay might be presented as a bound sequence, a series of unbound cards, or something like a PowerPoint slideshow, with the design and communication being crucial for its success. Like a traditional essay, it requires thorough research, organisation, and referencing, with an annotated bibliography using a referencing system.

The number of images and the amount of text should correspond to the effort needed for a written essay of a particular word count: for example, 10-12 images with 500-700 words of text as the equivalent of a 1500-word essay. The annotated bibliography should detail the usefulness and application of each source in the visual essay.

Here are some examples:

  1. Historical Events:
    • Topic: “The Evolution of Fashion: A Visual Journey Through the 20th Century.”
    • Description: This visual essay could depict the evolution of fashion throughout the 20th century, showcasing iconic styles from each decade alongside historical contexts that influenced these fashion trends.
  2. Environmental Science:
    • Topic: “The Impact of Plastic Waste on Marine Life.”
    • Description: A visual essay displaying the consequences of plastic pollution in oceans and seas, with images showcasing affected marine life, polluted areas, and comparisons of clean versus polluted waters.
  3. Social Issues:
    • Topic: “The Faces of Homelessness: A Glimpse into Life on the Streets.”
    • Description: This essay could present a series of portraits and living conditions of homeless individuals, aiming to humanise and shed light on the issue of homelessness.
  4. Technology:
    • Topic: “The Rise of Smartphones: Transforming Modern Communication.”
    • Description: A visual essay illustrating the evolution of smartphones, their impact on communication, social interaction, and the juxtaposition of traditional versus digital communication methods.
  5. Health and Wellbeing:
    • Topic: “The Mental Health Impact of Pandemic Lockdowns.”
    • Description: This essay could visually represent personal narratives, statistics, and scenes from daily life during lockdown, highlighting the mental health challenges faced by individuals.

Image: A Renaissance artist included in a visual essay on “immersion” and visual layout. Source: Van Leeuwen, T. (2007). Sound and vision. Visual Communication, 6(2), 136–145. https://doi.org/10.1177/1470357207077443

If you’d like to chat about GAI and assessment, or if you have comments, questions, or something to add, get in touch.
