Digital plastic: Generative AI and the digital ecosystem

I’ve got a love-hate relationship with the metaphors we use to talk about Artificial Intelligence, but at the risk of seeming hypocritical I’m going to make one anyway. I’m also going to mangle several fields of science, geology, and archaeology in the process, but hey, it’s all in the pursuit of The Metaphor.

I want you to imagine a digital archaeologist, some time in the distant future, digging through the strata of online data. Maybe it’s a person you’re imagining – probably an academic with too much time on their hands now that we’ve all been freed from the burden of having to do any real work.

More likely, it’s a bit of archival software, some AI bot trained to sift through the decades of content all the way back to 1993 and the release of the World Wide Web into the public domain.

Whatever your digital archaeologist looks like, imagine them standing on top of the pile of data amassed online since the 1990s, and trying to make some sense of the society that created it.

From the 1990s to the 2000s, there’s a thick layer of sediment built up of bulletin boards, news sites, and company websites. There’s a distinct red-white-and-blue colouration, with most of the data in the stratum coming from the US.

From the mid-2000s through the 2010s, the gradient shifts as more global users come on board and the first social media websites start to take off, with Facebook’s official launch in 2004, YouTube in 2005, and Instagram in 2010. The layers start to get a little wild as internet speeds increase and more and more photo and video content is added to the pile.

Other than a couple of interesting (archaeologically speaking) layers, like the bursting of the dot-com bubble around 2000 or the rise of smartphones from 2007, it’s a pretty consistent-looking formation.

And then, in a layer representing the early 2020s, something strange happens…

The era of digital plastic

Back to reality for a second, and to a physical geological phenomenon. Over the past 60 years (or arguably, as far back as the Industrial Revolution), the impact of humanity on the environment has been so stark that it is possible our mark will be a permanent feature of the Earth’s fossil record. There’s no scientific consensus yet as to when this impact started, or even whether there really is a geological epoch influenced by humans, but the idea of the “Anthropocene” was coined in the 1980s and has had a profound impact on the way we view our interactions with the environment.

In the notion of the Anthropocene, it’s suggested that our environmental footprint might be measured by deposits in the Earth’s strata, from the byproducts of burning fossil fuels to the effects of nuclear weapons. One key marker in the geological layers could be plastic. Most of the polymers in plastics are not readily biodegradable, and could feasibly leave traces that last for hundreds of thousands of years.

While current archaeologists use stratigraphy to contextualise finds from human settlement spanning back thousands of years, the impact of plastics could feasibly last for much longer if “plastiglomerates” make their way into the geological record.

Back to the future, and our digital archaeologist.

The anomalous layer that starts in the early 2020s is made up of trace materials of “digital plastic”: a mass-produced, synthetic form of data that, like its physical counterpart, doesn’t degrade much over time. By the late 2020s, the amount of digital plastic in the online strata outweighs the earlier shades of human-generated content.

As the layer of digital plastic builds up in the digital ecosystem, it becomes more compressed and more homogeneous, eventually flattening out to a sickly-looking smear of uniform data. The rise of generative AI (GAI) content spells the end of the richer, human epoch of online data.

Stretching the plastic analogy

I’m just going to run with this one for a while. If our digital footprint can be compared to our physical impact on the environment, and if synthetic content online can be compared to plastic, then there are a few other angles we might consider.

Microplastics have been found in human blood, with a recent study finding that plastic particles such as PET from bottles have made their way into our bodies. We’ve known for years that using plastic can lead to ingesting particles of these chemical-based products, and some of the chemicals used in plastics have been linked to cancer. Microplastics also fill the ocean, affecting marine life and even infiltrating previously “pristine” environments like the waters of Antarctica.

So what about digital microplastics? There are a couple of places I could go here, but beginning at the most literal level, a digital microplastic would be a small, broken-down component of the synthetic material.

It’s not that much of a stretch actually. An important aspect of pre-training language models is the initial tokenization – breaking down – of the data. Including synthetic data – digital plastic – in the training process has already been demonstrated to contribute to “model collapse” and the degradation of performance. It seems like GAI pollution might already be bad for the digital environment.
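To make that a little more concrete, here is a toy sketch in Python. It is not how any real language model is trained: it assumes a “model” is nothing more than a table of token frequencies, and that each new generation is trained only on text sampled from the previous one. The function name `resample_generations` and the made-up `human_corpus` are purely illustrative. The point is simply that once a rare token fails to appear in one generation’s output, no later generation can ever produce it again, so variety can only shrink.

```python
import random
from collections import Counter


def resample_generations(corpus, generations, seed=0):
    """Return the number of distinct tokens present in each generation.

    Toy assumption: "training" means counting token frequencies, and
    "generating" means sampling a new corpus from those frequencies.
    """
    rng = random.Random(seed)
    current = list(corpus)
    distinct = [len(set(current))]
    for _ in range(generations):
        # "Train" on the current corpus: record how often each token occurs.
        counts = Counter(current)
        tokens = list(counts)
        weights = [counts[t] for t in tokens]
        # "Generate" the next corpus entirely from the model's own output.
        current = rng.choices(tokens, weights=weights, k=len(current))
        distinct.append(len(set(current)))
    return distinct


# Hypothetical "human" corpus: 200 tokens drawn from 50 distinct words.
human_corpus = [f"word{i % 50}" for i in range(200)]
history = resample_generations(human_corpus, generations=30)
print("distinct tokens per generation:", history)
```

Run it and the count of distinct tokens starts at 50 and never goes back up, which is the flattening effect in miniature. Real model collapse involves far more than vocabulary loss, so treat this as an illustration of the intuition rather than a measurement.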

And what happens when digital microplastics enter our bloodstream?

The “metaverse” was much hyped in 2021 and 2022, not least by companies like Meta, which has invested heavily in augmented and virtual reality. Beyond the hype, however, there is serious money and time being invested in the hardware and software needed to make the metaverse a reality. Much of this investment is focused on the intersection of artificial intelligence and AR/VR.

If our future does involve AI-augmented reality, then what will be the impact of an environment awash with generative AI content?

What happens when synthetic media interfaces directly with our brains, through technologies like Neuralink? We have no idea. Then again, we had no idea that plastics would be so problematic when they were first invented as an alternative to ivory.

Who cares about the digital ecosystem?

At this point, you might be asking “who cares?”. Maybe I’m stretching the analogy a little thin, and the physical effects of the Anthropocene, from the Industrial Revolution, to nuclear weapons, to the impact of plastics, have no analogue in the digital world.

But our attitude towards the physical environment has shifted over time, and the way we understand our impact now is much different to our understanding in the 1950s and 60s. In the same way, perhaps in two or three decades we’ll be viewing the digital ecosystem through a much different lens.

There are already organisations dedicated to archiving and preserving the online environment. The Internet Archive and Wikipedia both have initiatives to document and safeguard online data for future generations. It’s not hard to imagine that future generations could use these archives in the same way as our imagined digital archaeologist, surveying the changes over time; in fact, such visualisation tools already exist and pull together data from the internet, radio, television, and other media.

Internet Archive visualisation tool of occurrences of the term Wikileaks on television

What does all of this mean for the future of the digital environment? Maybe organisations like the Internet Archive will morph into the digital equivalent of Greenpeace or the ACF. Perhaps we’ll see a generation of digital activists protesting against the impact of GAI on the digital environment, just as present-day climate activists call for action on our physical impact on the planet.

All metaphors aside, I think there’s a reasonable chance that the next few years will see a deterioration of the (already sketchy) quality of online materials. Cory Doctorow introduced us to the idea of the “enshittification” of online platforms in this fantastic article which is well worth a read if you’re concerned about the fact that the internet just ain’t what it used to be.

So, what can we do about it?

Just as with the impact of plastics on the environment, there are things we can do right now which might minimise the potential for harm. We’ve also got precedent for various approaches to limiting the negative impact of Generative AI, including from recent events worldwide. Here are a few suggestions:

  1. Media literacy education: In a digital age saturated with content, individuals must be equipped to discern fact from fiction. School programs, university workshops, online courses, or community sessions that focus on understanding the origins, biases, and potential misinformation in digital content can help users to become more discerning consumers of online information.
  2. Exploration of open-source platforms: Encourage users to engage with open-source platforms where transparency is prioritised. Right now, no foundation model scores higher than 54% on Stanford’s Foundation Model Transparency Index. Maybe in the future, developers will be compelled towards more transparency.
  3. Digital minimalism: This strategy focuses on mindful consumption. Just as we’re encouraged to reduce plastic usage in the physical world, in the digital realm, it’s beneficial to be selective about the content we consume and produce. This could involve decluttering digital spaces, or simple actions like unsubscribing from unnecessary newsletters, and being more intentional about the apps we download.
  4. Engagement with ethical content creation: The recent writers’ strikes in the US have demonstrated the power of content creators to establish ethical and professional boundaries. Companies will also be compelled through consumer demand to produce more ethical products, as with Adobe’s response to the digital arts community with its “more ethical” Firefly model.
  5. Get involved: There’s always a place online for active participation rather than passive consumption. Get involved in conversations like this, and let people know where you draw the line with human versus synthetic content.

Ultimately, I hope that this stratum of “digital plastic” doesn’t come to represent the next few decades of our online content. Realistically, I think that the output of Generative AI – impressive as it is – will lose its sway once people start to (re)value more meaningful content.

I have no doubt that much of the content online already is, and will continue to be, produced by Generative AI. But hopefully, little pockets of human and human+AI writing will continue to provide meaningful information and entertainment online.

