The Near Future of Generative Artificial Intelligence in Education: September 2024 Update

In my book Practical AI Strategies, released in January 2024, I dedicated the final chapter to the near future of generative artificial intelligence. I spoke about the convergence of multimodal technologies such as image, audio, and video generation, as well as improvements in automated coding, which would make it easier for people with no technical skills to create simple applications.

All of those January predictions have come true in the months since, so it’s time for an update and a look ahead at what the next 12 months will bring.

I don’t have a crystal ball, but it is possible to plot the near future trajectory of these technologies and examine how the commercial interests driving them forward will impact generative artificial intelligence, both within education and beyond. These are my next predictions for the near future of generative artificial intelligence.

Deepfakes and Synthetic Media

Earlier this year I wrote about deepfakes as a hot topic in the media and a concern for governments and legislators across the world. The proliferation of apps which use AI image-based technologies to create non-consensual nudes has led countries like the UK and Australia to pass new laws on image-based abuse. We’ve already seen the impact of these technologies in classrooms overseas and here in Australia, and to their credit, governments are moving swiftly to hold accountable both the developers and the social media platforms profiting from the advertising revenue of these apps.

But whilst the term “deepfakes” covers the non-consensual creation of AI-generated media, there is another much broader term, “synthetic media,” which captures some of the legitimate uses of the technology. This is what I’ve been referring to as “digital plastic,” and I think it is one of the major next frontiers of generative artificial intelligence.

Back in January, I talked about the maturation of image, video, and audio generation, and how these technologies would converge upon lifelike AI avatars. That has now happened: platforms like HeyGen and Runway are already capable of producing increasingly believable AI-generated videos of people with accurate lip-syncing, and audio generation platforms like ElevenLabs can both clone real voices and generate convincing artificial ones.

This means that it’s possible to generate content on demand, including consensual deepfakes. For example, I could use an image of myself and an audio clone of my voice to create a synthetic avatar of myself that can read parts of this blog post, or I could design educational materials that combine real and synthetic footage to create engaging videos.

I recently wrote about Google’s experimental NotebookLM, which has a feature that takes all of the uploaded source documents (up to 200,000 words’ worth of material) and creates short, engaging podcasts with AI-generated hosts. These podcasts are incredibly believable, both in the quality of the AI voices and in the manner in which the AI characters speak to one another.

Outside of education, we’re seeing platforms such as Character.AI and Replika racking up millions of hours of user interaction and pulling in huge amounts of investment from companies like Google. There will be positives and serious negatives as a result of synthetic media: increased access to educational materials on one side, and a tsunami of slop, or low-grade digital plastic, on the other; AI avatars which can offer legitimate advice and support, balanced out by highly addictive chatbots which target lonely people and are designed to replace human relationships.

Legislating against non-consensual explicit deepfakes and image-based abuse is one important step in mitigating the risks, but it does not go far enough in understanding the implications of these technologies. To push these conversations forward, we’ve recently published a research agenda for deepfakes in higher education and are currently surveying people across higher education for their perspectives on deepfakes and synthetic media.

Pseudo-Reasoning

The recent release of OpenAI’s latest model, o1 (nicknamed “Strawberry”), has sparked a furious debate over whether large language models can reason and plan. Up until now, AI experts like Yann LeCun have labelled reasoning and planning as the next big frontier of generative artificial intelligence. Academic Gary Marcus agrees (a miracle in itself, since LeCun and Marcus generally argue about these things) and suggests that some form of neurosymbolic AI might be needed to bridge the gap between large language models and actual artificial general intelligence.

But OpenAI claims its latest model has addressed these issues. Many people, myself included, are skeptical of the capabilities of o1, currently only available in preview, since essentially all the model does is iterate through chain-of-thought prompting. While this method of breaking a task into separate steps for an LLM to approach one at a time has been shown to improve results on complex tasks, it is not reasoning, it is not cognition, and it is not the way humans think. So for now, I’m calling it “pseudo-reasoning.”
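To make the distinction concrete, here’s a minimal sketch of chain-of-thought prompting from the user’s side, using the OpenAI Python SDK. The model name, prompt wording, and example question are my own illustrative choices; o1 applies something like this step-by-step strategy internally and automatically, which is precisely why I’d call it pseudo-reasoning rather than reasoning.

```python
# A minimal sketch of chain-of-thought prompting with the OpenAI Python
# SDK. The model name and prompt wording are illustrative assumptions;
# o1 applies a similar step-by-step strategy internally, without the
# user asking for it.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Direct prompt: the model answers in one shot.
direct = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought prompt: ask the model to break the task into steps
# before answering. This often improves results on multi-step problems,
# but it is still next-token prediction, not human-style reasoning.
cot = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": question + "\n\nWork through the problem step by "
        "step, then state the final answer.",
    }],
)

print(direct.choices[0].message.content)
print(cot.choices[0].message.content)
```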

I think this will continue to be a focus in the near future of generative AI. We will see significant advances in pseudo-reasoning, which may produce models better suited to complex processes such as mathematics and scientific applications. In my own experiments, I found o1-preview highly competent at structured tasks such as building simple software applications, but worse than the preceding model, GPT-4o, or competing models like Claude 3.5 Sonnet at complex language-based tasks like discourse analysis.

Finding appropriate uses for pseudo-reasoning and more complex chain-of-thought models will be an important piece of the puzzle for the next 12 months.

General to Specific

I mentioned Google’s NotebookLM earlier, and it’s worth revisiting as an example of my next prediction for the near future of language models. Since the release of ChatGPT in November 2022, we have seen a slew of technologies built on top of large language models. These typically take the form of a website built on top of, and connected to, the API of a large language model such as OpenAI’s GPT. In many ways, however, these platforms are just as broad and general as the language models they are built upon.

Some, like the EdTech platform MagicSchool, claim to refine the output of the model and produce more contextualised, education-appropriate materials, but if you scratch below the surface, you won’t find much that you can’t achieve just by using ChatGPT itself.
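To illustrate what I mean, here’s a minimal sketch of what many of these wrapper platforms amount to under the hood: a fixed, education-flavoured system prompt placed in front of a general model. The prompt text, function name, and model are my own assumptions for illustration, not any platform’s actual implementation.

```python
# A sketch of a typical "EdTech wrapper": a thin layer over a general
# LLM API. The system prompt and model name are illustrative
# assumptions, not any platform's actual implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a helpful assistant for teachers. Produce classroom-ready "
    "materials in clear, age-appropriate language, matched to the year "
    "level and subject the teacher specifies."
)

def generate_lesson_material(request: str) -> str:
    """Forward the teacher's request to a general model with a canned persona."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": request},
        ],
    )
    return response.choices[0].message.content

# Nothing here that a teacher couldn't achieve by pasting the same
# instructions into ChatGPT directly.
print(generate_lesson_material("A 10-question quiz on the water cycle for Year 5."))
```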

Similarly, there is a push from governments and technology companies across the world to digitise national curricula and build chatbots which sit on top of them for educators and students to interact with. I’ve written about why I think this is a bad idea and why the technology isn’t up to the task. Again, little is being done here that can’t be achieved directly with a big, general model. It doesn’t shift the dial on the technology: it doesn’t make it any more useful for educators, except perhaps by reducing the need for a skilled, professional understanding of the curriculum materials that sit behind these applications.
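For context, these curriculum chatbots are typically (though the announcements rarely spell it out) built as retrieval-augmented generation: curriculum excerpts are retrieved and pasted into the prompt of a big, general model, which still does all the actual work. Here’s a minimal sketch under those assumptions; the model names, toy curriculum snippets, and single-chunk retrieval are all illustrative.

```python
# A sketch of what a "curriculum chatbot" typically amounts to:
# retrieval-augmented generation (RAG) over curriculum documents.
# Model names and the toy curriculum snippets are illustrative
# assumptions, not any real system's implementation.
import numpy as np
from openai import OpenAI

client = OpenAI()

# In a real system these would be chunks of the digitised curriculum.
chunks = [
    "Year 5 Science: students investigate the water cycle...",
    "Year 7 Mathematics: students solve linear equations...",
    "Year 9 English: students analyse persuasive texts...",
]

def embed(texts: list[str]) -> np.ndarray:
    """Embed texts with an OpenAI embedding model."""
    response = client.embeddings.create(
        model="text-embedding-3-small", input=texts
    )
    return np.array([item.embedding for item in response.data])

chunk_vectors = embed(chunks)

def answer(question: str) -> str:
    """Retrieve the most relevant chunk, then hand it to a general model."""
    q_vec = embed([question])[0]
    # Cosine similarity against every chunk; pick the best match.
    sims = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = chunks[int(np.argmax(sims))]
    # The general model still does all the work of interpreting and
    # answering; the "curriculum chatbot" is just this prompt wrapper.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Answer using only the curriculum excerpt provided."},
            {"role": "user", "content": f"Curriculum excerpt:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

print(answer("What do Year 5 students learn about the water cycle?"))
```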

A much more interesting example is what applications like Google’s NotebookLM suggest about the future of generative AI in education. Rather than trying to do everything, or pigeonholing the language model (Gemini, in this case) into a set of pre-built templates, NotebookLM combines the highly capable general skills of a large language model, such as synthesis and inference, with a much more targeted, specific use case: an AI research notebook.

At this stage, it’s only an experimental product, but it isn’t trying to be anything else. Unlike MagicSchool and Khanmigo, it doesn’t offer educators hundreds of buttons to click and templates to generate. Unlike Oak National Academy, it doesn’t try to cram a curriculum into pre-formatted PDFs with a predetermined pedagogical approach. It simply says, “Put your documents in here, and I’ll help you to organise them and take notes.”

Another example comes from Microsoft, though it’s an application which has been touted more for business use than for education. Copilot in Teams works specifically with live transcripts of Teams meetings: it can transcribe discussion, analyse and summarise minutes, and assign action items.

We Will Be Caught Off-Guard

As well as these three near-term trajectories, it’s worth reflecting on all of the things which have caught us off-guard in the past few months. The developers behind these technologies are working in a highly competitive environment, and a lot of money (billions of dollars) is being spent on churning out endless products and advances.

OpenAI teased “Strawberry” for a while before releasing o1, in a series of mostly obnoxious tweets and obscure social media posts. Speculation about advanced reasoning and cognition flooded forums on Reddit and elsewhere in the weeks before its release. But everyone seemed to be caught off guard by the length and complexity of the responses. Some users have reported the model “thinking” for over 40 minutes before producing a huge, rambling response to a seemingly simple prompt.

In a similar vein, both Anthropic’s Claude and Google’s Gemini surprised people with enormous context lengths for their products: Claude can handle 200,000 tokens (roughly 150,000 words) in a chat or through its Projects, and Gemini Advanced can work with a million tokens. So, while these advances in the size of inputs and outputs took us by surprise in 2024, they should be considered inevitable going forwards.
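For readers unfamiliar with tokens: a token is roughly three-quarters of an English word, so a million-token window holds on the order of 750,000 words. Here’s a small sketch using OpenAI’s tiktoken library to make that concrete; the choice of the o200k_base encoding (used by GPT-4o) is my own assumption, and other models use different tokenisers.

```python
# Count tokens with OpenAI's tiktoken library, to give a feel for what
# a "million-token context window" means in practice. The o200k_base
# encoding (used by GPT-4o) is an illustrative choice; other models
# use different tokenisers.
import tiktoken

enc = tiktoken.get_encoding("o200k_base")

text = "Generative AI will continue to surprise us."
tokens = enc.encode(text)

print(f"{len(text.split())} words -> {len(tokens)} tokens")
# Rule of thumb: ~0.75 words per token in English prose, so a
# 1,000,000-token window holds on the order of 750,000 words.
```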

What else took us by surprise, and what can we learn?

Well, we’ve had plenty of shiny tech demos from the major players. From OpenAI, improved mathematical reasoning, advanced voice mode features (which still haven’t been fully released), and February 2024’s Sora video model spring to mind: multimodal, cross-disciplinary AI is the clear near future of that company’s products.

Both Google’s Project Astra and Apple’s new and improved Siri point the way towards AI-powered voice assistants, and both will require advances in another “near future” trend: small, on-device language models.

If nothing else, you can guarantee the hype machine will keep on churning: every step forwards will be hailed as bringing us an inch closer to Artificial General Intelligence. Every update from OpenAI will be met with equally loud choruses of praise and condemnation. Ethicists and environmentalists will continue to raise red flags just as quickly as Microsoft, Google, Meta, Amazon, Apple, and a host of other startups and developers release wave after wave of new products.

But whatever else happens with Generative AI, you can guarantee we will be surprised.

Want to learn more about GenAI professional development and advisory services, or just have questions or comments? Get in touch.
