Science Fiction Worldbuilding with Generative AI

It’s been a while since I made a post highlighting some of the creative multimodal capabilities of GenAI. It can be easy to get swept up in the drama of recent events like OpenAI’s board firing its CEO (and his subsequent possible flight to Microsoft, or return to OpenAI…). It’s also easy to focus on the darker side of the technology, including the many ethical issues. But it’s just as important to acknowledge that these tools offer real benefits to creators and learners.

In this post, I’m going to go step-by-step through a process of “worldbuilding” with generative AI. I’m using ChatGPT Plus (GPT-4) since it has the multimodal capabilities of text and image generation, code interpreter, image recognition, and the ability to browse. You could also follow a similar process with Microsoft’s free Bing Chat and image creator, and a handful of other multimodal tools.

I’ll go through the process in detail, and there’s a 5-minute video at the end explaining the whole thing.

Creating a World

I think it’s always best to start any interaction with generative AI with a “job” in mind. I don’t personally use any particular framework for prompting, and I don’t consider myself a “prompt engineer”: I just know what I want, and try to work out a way to get there.

In this instance, I wanted to see if multimodal GenAI could be a useful tool for science fiction worldbuilding: creating plausible settings for short or long sf stories, perhaps in the future, maybe on other planets. Worldbuilding is an art in itself. The best science fiction novels create worlds that blend reality and fantasy, often in ways which are based on plausible or existing technologies.

I used a simple, straightforward prompt to begin the process:

We’re going to do some world building for a science fiction fantasy world. What I’d like you do to is help with things like the physics, the physical description of the environment, and some visualisations. First, I’ll provide some advice about how to do science fiction world building from a suitable website. Ready?
ChatGPT (GPT-4) prompt

Next, I decided to anchor the worldbuilding in some guidelines from a blog post I’d found online, using the Bing browsing feature of GPT-4. I sent it off to an article at Dabble Writer and asked it to generate some key questions and considerations for our worldbuilding process. Here’s what it came up with:

Ordinarily, I’d go through the process of answering these questions myself. But since this is just a demonstration of the technology, I figured I’d skip a step and get ChatGPT to generate its own answers. Here are the answers:

I was pretty happy with what it came up with, so I then prompted for some more information on the physical world. Again, I decided to anchor the next response with some real-world information, so I sent ChatGPT off to the Encyclopaedia of Exoplanets website. Unfortunately, that was a little too broad and it returned a generic response:

To get it to give me something a little more specific, I went to the site myself and clicked a random exoplanet link from the catalogue. I then passed that link along to ChatGPT and asked it to refine its response:

Visualising the Exoplanet

Before going any further I decided it would be good to take a look at our semi-fictional science fiction setting. I added a “fictional habitable moon” and some “exotic energy” to the scenario, and then requested an image. This is what ChatGPT generated:

Exoplanet XO-7 b with its fictional moons, as generated by ChatGPT (GPT-4)

Since we’re dealing with science fiction, it’s also good to throw a bit of actual science into the mix. Leveraging ChatGPT’s code interpreter capabilities, I asked for a couple of physics experiments: first, something to show off our exotic energy source, second, to model the physics on the habitable moon.
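
To give a sense of the kind of thing code interpreter can run for the second experiment, here is a minimal sketch of my own, not the code ChatGPT actually produced. The moon's mass, radius, and orbit, and the parent planet's mass, are all hypothetical placeholders rather than real data for XO-7 b.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def surface_gravity(mass_kg, radius_m):
    """Surface gravity in m/s^2 from Newton's law: g = G * M / r^2."""
    return G * mass_kg / radius_m ** 2


def orbital_period(parent_mass_kg, orbit_radius_m):
    """Period in seconds of a circular orbit (Kepler's third law)."""
    return 2 * math.pi * math.sqrt(orbit_radius_m ** 3 / (G * parent_mass_kg))


def fall_time(height_m, gravity):
    """Seconds for an object to fall height_m from rest, ignoring air resistance."""
    return math.sqrt(2 * height_m / gravity)


# Hypothetical values for the fictional moon (assumptions, not measurements)
MOON_MASS = 0.5 * 5.972e24    # half of Earth's mass, kg
MOON_RADIUS = 0.8 * 6.371e6   # 80% of Earth's radius, m
PLANET_MASS = 1.898e27        # placeholder: one Jupiter mass, kg
ORBIT_RADIUS = 1.0e9          # placeholder orbital distance, m

g_moon = surface_gravity(MOON_MASS, MOON_RADIUS)
period_days = orbital_period(PLANET_MASS, ORBIT_RADIUS) / 86400

print(f"Surface gravity: {g_moon:.2f} m/s^2")
print(f"Orbital period: {period_days:.1f} days")
print(f"Time to fall 10 m: {fall_time(10, g_moon):.2f} s")
```

Swapping in different placeholder values is a quick way to check whether a moon you have invented would feel heavier or lighter than Earth to the characters walking on it.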

Back to the browsing functionality for a second, and I now added in some contextual information on the planet Earth in 2055. I used information from Future Timeline to ground the model’s response:

And then generated an image of Earth 2055 based on that information:

Finally, I wanted to see if ChatGPT could turn this information into something a little more interactive, so I asked for a low-code, browser-based game of its design. What it gave me wasn’t exactly a fun game, but it worked for me as a quick demonstration. ChatGPT created a clickable button which increases the levels of the exotic energy, and accordingly decreases the “safety” of the environment:
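
The game ChatGPT built ran in the browser, but the logic behind the button is simple enough to sketch in a few lines of Python. This is my own illustration rather than the generated code, and the starting values and step sizes are made up.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    energy: int = 0     # exotic energy level (hypothetical starting value)
    safety: int = 100   # environmental "safety" score (hypothetical)


def click(state, energy_step=10, safety_step=15):
    """Return the state after one button click: energy rises, safety falls (never below 0)."""
    return GameState(
        energy=state.energy + energy_step,
        safety=max(0, state.safety - safety_step),
    )


state = GameState()
for _ in range(3):
    state = click(state)

print(state)
```

Each click trades safety for energy, which is about as much "game" as the original demonstration had.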

Crewing the Space Mission

Time to add some humans into the world we’re building. Now that I have two settings – the exoplanet and Earth 2055 – I decided to generate some images of possible characters for the story. Again, I’m going for a quick demonstration here rather than full-blown worldbuilding, so I’ll just take what I can get directly from ChatGPT.

Unfortunately, the first image fell right into the AI stereotype trap and gave me a chisel-jawed white man as my astronaut:

After a few more rounds of prompting, I had a slightly more diverse crew:

Mission Control

So far, I’ve demonstrated quite a few of the features of multimodal GenAI without leaving the platform of ChatGPT:

Text generation for the initial planning
Web browsing for the questions, exoplanet info, and 2055 timeline
Image generation for the planet, Earth, and astronauts
Code generation and execution for the physics experiments

I realised I’d missed image recognition, so thought I’d try a variation on something I’ve done before: turning a sketch into a website. At this point, though, things started to fall apart. I’ve edited most of this out of the final video, but I had some seriously frustrating moments trying to get ChatGPT to stay on task. In fact, I think I hit every single one of the most common issues when trying to get ChatGPT to perform complex tasks:

Timing out when running long code
Issues with lengthy text and long filenames in code interpreter
Generating things which just don’t work
Saying it can’t do something which it can do (like write functioning HTML)
Random red error messages

Here’s the bloopers reel:

Eventually, I got ChatGPT to write a summary of the entire thread and some text for our website, then copy/pasted that into a new chat and started from scratch. I’ve found that sometimes ChatGPT needs a bit of a palate cleanser as threads get lengthy and start to go beyond the context window – the amount of text the model can process in one go.

Things started to go a lot more smoothly, beginning with the sketch to website design efforts:

Mostly accurate; however, it has fabricated a second sidebar which will turn up in its attempts to build the website later. I want to break down the prompt in more detail because I used a few “tactics” to try to make the process more effective:

We are designing a website as the basis for an extra solar mission. The website will include images from the mission, text I will provide, and profiles of the astronauts. We will do this step by step. I’ll provide all of the images, starting with one for the layout. You won’t be able to build this in one go so you’ll have to break it down into stages and go through step by step, for example creating the CSS and html files in different steps, then putting them all together in a zip file for me. The first step is to understand the layout. I have attached a sketch.
Prompt for ChatGPT (GPT-4)

I’ve given the scope of the job, and also some specific requests such as “break it down into stages and go through step by step”. This is because longer tasks are much more prone to error, especially when using code interpreter. Code interpreter also runs out of steam after about 60 seconds, so any code that is too long to write and execute will time out.
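
The final "put them all together in a zip file" step is the sort of thing code interpreter does with a short Python script. Here is a rough sketch of what that might look like, using Python's standard zipfile module. The file names and contents are placeholders I've invented, not the files from my chat.

```python
import tempfile
import zipfile
from pathlib import Path


def build_site_zip(files, zip_path):
    """Write each filename -> text pair into one zip and return the archived names."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    # Reopen the archive to confirm what was actually written
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Hypothetical placeholder files, each produced in its own earlier step
site_files = {
    "index.html": '<!DOCTYPE html>\n<html><head><link rel="stylesheet" href="style.css"></head><body></body></html>\n',
    "style.css": "body { margin: 0; }\n",
}

with tempfile.TemporaryDirectory() as tmp:
    names = build_site_zip(site_files, Path(tmp) / "site.zip")

print(names)
```

Because each file is written in a separate, short step, no single piece of code has to run for long.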

Following my instructions, ChatGPT stepped out its process and began with the HTML shell for the website structure:

Next, it developed the CSS style sheets and the HTML code. I instructed it to use stock images from Unsplash in the first test prior to uploading the actual images:

Finally, I uploaded the actual image files and the text from the previous chat to fill the website, along with the instruction to include three astronaut profiles (not one, as in the image above):

After some back-and-forth where I hit a few more “timeout” issues, I ended up asking for the HTML code in three sections, and manually copy/pasted them into a blank text file. Again, this is something I’ve done in the past to avoid the timeout issues and I’ve found it to be generally effective if the code isn’t too long.
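
I did the stitching by hand, but if you'd rather script it, something like the following would work. It's a minimal sketch with invented section contents: it joins the pieces in order, saves them as one file, and does a very rough check that the result starts and ends like a complete HTML page.

```python
import tempfile
from pathlib import Path


def assemble_html(sections, out_path):
    """Join the HTML sections in order, write them to out_path, and return the text."""
    html = "\n".join(section.strip() for section in sections) + "\n"
    Path(out_path).write_text(html, encoding="utf-8")
    return html


def looks_complete(html):
    """Rough sanity check only: has a doctype at the start and a closing html tag at the end."""
    text = html.strip().lower()
    return text.startswith("<!doctype html>") and text.endswith("</html>")


# Hypothetical stand-ins for the three sections pasted from the chat
sections = [
    "<!DOCTYPE html>\n<html>\n<head><title>Mission</title></head>",
    "<body>\n<h1>Mission</h1>",
    "<p>Crew profiles go here.</p>\n</body>\n</html>",
]

with tempfile.TemporaryDirectory() as tmp:
    page = assemble_html(sections, Path(tmp) / "index.html")

print(looks_complete(page))
```

The check won't catch broken markup in the middle, but it will tell you if a section was cut off or pasted in the wrong order at either end.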

Eventually, and honestly a little to my surprise, I ended up with something approaching the website I was imagining:

It’s more obvious in the video, but the exoplanet image is in the background and has the requested “parallax scroll” effect (the thing where you scroll a website but the background image stays put). The image of Earth came in underneath and was a bit awkwardly sized, but at this stage I decided to call it quits.

Summary and video

OK, so back to the earlier summary, but with a few additions. Throughout this entire process, I’ve done some science fiction worldbuilding but more importantly I’ve used all of the multimodal tools within ChatGPT. As I said at the beginning, you could do many of these things for free on other platforms like Microsoft Bing Chat. Here’s the full summary:

Text generation for the initial planning
Web browsing for the questions, exoplanet info, and 2055 timeline
Image generation for the planet, Earth, and astronauts
Code generation and execution for the physics experiments
Image recognition to “read” the website sketch
HTML, CSS, and Javascript generation and creation of downloadable HTML files in a .zip folder
Creation of a functioning website using text and images generated by the above processes

Not bad, really. Here’s the video explaining the process from start to finish:

If you’ve enjoyed this post and you’d like to discuss generative AI, professional learning, or my work on AI policies and guidelines, then please get in touch using the form below:

Leon Furze