3 Ways for Educators to Run Local AI and Why You Should Bother

I really want you to try local AI. Not because it’s useful, or practical, or even particularly good, but because I don’t want you to be caught out when models like Meta’s Llama catch up with GPT-4, and everyone suddenly has access to incredibly powerful AI for free.

And, given the trajectory of AI development, that will happen sooner rather than later. The gap between closed, proprietary AI (like the biggest models from OpenAI, Anthropic, and Google) and “open source” is already fairly slim, and it’s getting slimmer.

Now is the time to experiment with these tools. In this blog post, I’ll show you three easy ways to set up local AI models in minutes, with very little technical knowledge required.

What is local AI?

“Local” AI runs on your device, with or without an internet connection. In this article, I’m referring specifically to local generative AI Large Language Models (LLMs) and will stick to a couple of specific models:

  • Llama, from Meta
  • Gemma, from Google

Sometimes, these models are referred to as “open source”, but that’s controversial since not all aspects of the models are open – we don’t know the exact details of the training data or methods, and the licenses are sometimes more restrictive than true open source licenses.

In this article, I’m not too interested in the semantics of “open source” versus “open weights”, so I’ll just refer to them as “local models”, since the feature I’m focusing on is the ability to run them on-device and offline. This means a local model needs to be something we can download and run on a laptop, phone, or other device.

Why use a local LLM?

There are a number of reasons to use a local LLM. Remember, though, the purpose of this article is simply to get you to try them out and imagine what the future might look like. But a lot of companies and developers are starting to work with local AI because it is cheap, energy efficient, and private.

It costs nothing to download local AI models, and they don’t require a subscription. Since they don’t rely on cloud servers, they consume less energy per use than applications like ChatGPT: beyond the one-off energy cost of training, the only cost is whatever the model consumes while running on your device. And because they don’t send data off your device or use the internet, your data stays private and accessible only to you.

The Practical AI Strategies online course is available now! Over 4 hours of content split into 10–20 minute lessons, covering 6 key areas of Generative AI. You’ll learn how GenAI works, how to prompt text, image, and other models, and the ethical implications of this complex technology. You’ll also learn how to adapt education and assessment practices to deal with GenAI. The course has been designed for K-12 and Higher Education.

Method 1: GPT4All

GPT4All is an application developed by Nomic, designed to work with a wide range of local language models, including Meta’s Llama and the others we’ll be testing out. I’m starting with GPT4All because it’s the most user-friendly application and the easiest way to get started with a local LLM. It runs on Mac, Windows, or Linux, and depending on the size and sophistication of the model, you should be able to use it on most relatively new devices.

To get started, head to nomic.ai/gpt4all and download the application for your device.

Once you have downloaded and installed GPT4All, open it up and head to the “models” tab. Choose a model (I’m downloading Meta’s Llama 3.1 8B model) and download the file. The size of the file depends on the number of “parameters” in the model. This 8 billion parameter model is about a 4.7GB download – roughly 8 billion parameters compressed to around 4–5 bits each – and is relatively powerful: something like ChatGPT-3.5-and-a-bit.

Once the model is installed, head to either the home or the chats tab and start a new chat. Try anything you like – I started with an obnoxious request to say hello using “complex code”.

GPT4All also has a range of other features, such as the ability to load your own documents into a chat, much like attaching documents in ChatGPT or Claude. Play around and see what it can do.
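
If you’re comfortable with a little code, GPT4All also has Python bindings (pip install gpt4all). Here’s a minimal sketch – note that the model filename below is an assumption, so copy the exact name from GPT4All’s model list:

from gpt4all import GPT4All

# Loads the model, downloading it first if it isn't already on disk.
# The filename is an assumption – check GPT4All's model list for the
# exact file you want.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# A chat session keeps the conversation history between prompts.
with model.chat_session():
    reply = model.generate("Say hello using complex code", max_tokens=200)
    print(reply)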

Method 2: PocketPal on a Smartphone

There are a number of ways to use local LLMs on a phone. In fact, both Google and Apple are building new devices with AI on board: Apple Intelligence will be a feature of the iPhone 16 and 16 Pro, and new Google Pixel phones come with Google Gemini Nano on device. But these tiny language models are designed for specific purposes, such as augmenting assistants like Siri.

Instead, I’m going to download a simple app which lets me choose which model to run: much like GPT4All, but for my phone. There are many apps out there which will do this (and some snake oil merchants to watch out for), but I’m using Asghar Ghorbani’s PocketPal on my iPhone 14 Pro because of its simplicity. Download it from the iOS App Store or wherever you get your apps.

As with GPT4All, you’ll need to download a model before you can use it. For a phone of my vintage, I’d recommend a smaller model: I downloaded the 3 billion parameter version of Llama 3.2. It’s not as capable as the 8B model, but it’s much less likely to melt my phone.

Don’t worry about the additional settings. Just dive straight in with a clichéd AI task like getting it to write terrible poetry.

The first thing you should notice is that it’s surprisingly fast. Llama 3.2:3b on my 2022 iPhone performs much like the ChatGPT-3.5 of old (as in, November 2022). It’s about as competent at writing poetry as that ChatGPT was, too…

Try it out with a bunch of tasks you’ve recently used ChatGPT for and compare the two. Even free users of OpenAI’s chatbot should be using GPT-4o by now, so that model will likely produce better output. But in many tasks, there’s not much difference.

Method 3: Ollama

Despite the name, Ollama is not affiliated with Meta or its Llama models. Like GPT4All, it’s an open source application designed to run many different LLMs. I’ve left it until last because it’s the most technical of the three methods, but even so it’s not difficult to get up and running.

Head to ollama.com and download the relevant version (note that the Windows version is currently in preview).

With the application installed and running, you’ll need to – guess what – download a model. To do this, you’ll use a command line tool like Terminal on a Mac. I’m using an M3 Pro MacBook and have already downloaded three models: Llama 3.2:1b, Llama 3.1:8b, and Gemma2:2b (you can check what’s installed at any time with ‘ollama list’). To use Ollama from the command line, you type ‘ollama [command]’.

To install a model, you need to run it for the first time with the command ‘ollama run [model name]’. For example, to install Gemma:

ollama run gemma2:2b

The first time you use the run command it will “pull” the model onto your device (download it); from then on, the same command will launch the model. Once it loads, you use it just like a chatbot by typing prompts after the >>> prompt. To quit, use ctrl+d or type “/bye”.

Gemma 2:2b is the smallest model I’m currently using.
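
It’s also worth knowing that the Ollama application quietly runs a local server in the background (at localhost:11434 by default), which other programs can talk to. Here’s a minimal sketch using the official ollama Python client (pip install ollama), assuming the app is running and you’ve already pulled gemma2:2b as above:

import ollama

# Sends a single chat message to the local Ollama server
# (which listens on localhost:11434 by default).
response = ollama.chat(
    model="gemma2:2b",
    messages=[{"role": "user", "content": "Write a terrible poem about rain."}],
)
print(response["message"]["content"])

This uses the same model you pulled with ‘ollama run’, so there’s nothing extra to download.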

Comparing the three methods

To compare the three methods, I’m going to repeat an exercise from last week’s post on differentiation. The prompt is as follows:

Using the concept of tiered instruction, create a three-tiered lesson plan for teaching the water cycle to a Year 7 Science class. For each tier, provide activities that address the same fundamental concepts but vary in complexity, scaffolding, and modes of expression. Ensure that higher tiers involve more abstract thinking and independent work, while lower tiers offer more concrete experiences and structured support.
Use the following tiering approach:
Tier 1 (base level): Basic learning and skills all students must know.
Tier 2 (middle level): Application of knowledge, where students manipulate information.
Tier 3 (top level): Critical thinking, where students evaluate and analyze.
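
As an aside: if you want to rerun a comparison like this yourself without pasting the prompt into each chat by hand, a short script is one way to do it. This sketch uses the ollama Python client from earlier, so it only covers models installed through Ollama (the model names are the ones I have installed), not the two apps:

import ollama

# Paste the full tiered-instruction prompt from above between the quotes.
PROMPT = """Using the concept of tiered instruction, create a three-tiered
lesson plan for teaching the water cycle to a Year 7 Science class..."""

# Run the same prompt through each locally installed model and print the results.
for model in ["llama3.2:1b", "llama3.1:8b", "gemma2:2b"]:
    result = ollama.generate(model=model, prompt=PROMPT)
    print(f"--- {model} ---")
    print(result["response"])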

Here’s how the three models fare:

GPT4All (MacBook M3 Pro – Llama 3.1:8b)

PocketPal AI (iPhone 14 Pro – Llama 3.2:3b)

Ollama (MacBook M3 Pro – Gemma 2:2b)

Application outputs

This PDF contains all of the outputs from ChatGPT’s o1-preview, GPT4All (Llama 3.1:8b), PocketPal (Llama 3.2:3b), and Ollama (Gemma 2:2b). Download or read it here to directly compare the most powerful OpenAI model to the two Llamas and Google’s tiny Gemma.

Here are my thoughts (noting that I’m an English teacher, not a Year 7 Science teacher):

  • o1-preview provided the most options and was the best organised. This is unsurprising given it is designed to “think more slowly” than typical models and to provide more stepped-out responses. It’s the most powerful model, and the only one you have to pay to use.
  • Llama 3.1:8b (GPT4All) produced a ‘level 3’ task that I actually think is more interesting and distinctive than the options provided by o1-preview, but it only gave one suggestion for each tier. Interestingly, its ‘tier 2’ activity was the same as o1-preview’s tier one diagram activity.
  • Llama 3.2:3b (PocketPal) gave essentially the same suggestions as the 8b model, but in a slightly different order and with less detail. If it came down to a decision between the two, I’d probably choose the 3b for its small size and convenience (a roughly 2GB download on a phone versus an almost 5GB download on a laptop).
  • Gemma 2:2b (Ollama) gave suggestions that were sort of a mashup of all the above, and the briefest in terms of detail – unsurprising given the tiny size of the model (under 2GB to download). The suggestions weren’t bad, but leaned towards “gamified” activities and simulations. The research task for the third tier was pretty predictable (though that suggestion also came up in the very powerful o1-preview).

What Does This Mean for Education?

I want to reinforce that this post is not about converting you into an open source zealot, or recommending that you rush off and tell all of your students to download local AI. It’s a way of showing what I see as one aspect of the future of AI: I strongly believe that small, local, on-device models will come to replace many of the functions of the big proprietary AIs which are currently the norm.

So, I want to finish with a few questions to encourage you to think through the implications of local AI for education:

  • How might the availability of free, offline AI tools change the landscape of educational technology and equity in access to AI resources? What does this look like if only expensive, powerful devices can run the free AI?
  • What new challenges might arise in terms of academic integrity when students have unrestricted access to powerful AI assistants on their personal devices?
  • How could curriculum and assessment practices evolve to account for students’ constant access to AI tools, both online and offline?
  • What skills and literacies will become increasingly important for students to develop in a world where AI assistance is ubiquitous and easily accessible?
  • How might the role of teachers shift as AI tools become more sophisticated and readily available to students both in and out of the classroom?

Want to learn more about GenAI professional development and advisory services, or just have questions or comments? Get in touch.
