Teaching AI Ethics: Datafication

This is the sixth post in a series exploring the nine areas of AI ethics outlined in this original post. Each post goes into detail on the ethical concern and provides practical ways to discuss these issues in a variety of subject areas. For the previous post on privacy, click here.

“Datafication” is a term used to describe how all aspects of our lives are being turned into datapoints. Whether through the collection of our likes, shares, and ratings on social media and streaming apps, or through the harvesting of physical data from devices like smartphones and smartwatches, datafication is what powers artificial intelligence. In the words of British data scientist Clive Humby, “Data is the new oil”.

Datafication has become a defining characteristic of our modern world, as technological advances enable the collection, storage, and analysis of vast amounts of data from nearly every aspect of our lives. While this process has brought numerous benefits, such as more efficient services, better decision-making, and increased personalisation of products, it also raises significant ethical concerns. In this post, I’ll go into the ethical implications of datafication, exploring the impact on privacy, surveillance, and potential misuse of data, as well as examining the responsibility of organisations and individuals in ensuring ethical data collection practices.

Here’s the original PDF infographic which covers all nine areas of AI ethics:

Case study: The datafication of education

Datafication sounds like a complex term – or perhaps something in the realm of conspiracy theorists – but it’s alarmingly simple for companies and developers to collect data about many aspects of our lives. With the rise of 1:1 devices in schools and the increased prevalence of students carrying their own phones, it becomes even easier. In fact, some educational apps collect an incredible amount of data on students.

A study by Atlas VPN found that 98% of iOS educational apps gather user data, with each app on average collecting information from over 8 data segments. This can include names, emails, phone numbers, location, payment information, and search history, among others. Duolingo, a popular language learning app, topped the list by collecting user data across 19 segments. Other notable data-hungry apps include Busuu, another language learning app, and Google Classroom, a learning platform, both of which collect data from 17 segments.

The study analysed the App Store privacy labels of 50 popular iOS apps in the education category, ranking them based on the number of personal user information segments collected. Note that some apps collect data that cannot be linked back to a user’s identity, and these data segments were not included in the total count. The primary purposes for collecting data were app functionality (86%), analytics (80%), personalisation (56%), and developer’s advertising or marketing (54%). In addition, 24% of the apps used collected data for third-party advertising, passing user information on to other organisations.
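For classes that want to try this kind of analysis themselves, here is a minimal Python sketch of how a tally like this could be computed. It is not Atlas VPN's actual method: the apps, segments, and purposes below are invented for illustration, and the function names are my own. It assumes each app's privacy label has been typed up by hand as a list of data segments, each marked as linked or not linked to the user's identity.

```python
from collections import Counter

# Hypothetical privacy-label records. These apps and values are invented
# for illustration and are NOT taken from the Atlas VPN study.
APPS = [
    {
        "name": "App A",
        "segments": [
            {"type": "Name", "linked": True},
            {"type": "Email", "linked": True},
            {"type": "Location", "linked": True},
            {"type": "Diagnostics", "linked": False},
        ],
        "purposes": {"app functionality", "analytics"},
    },
    {
        "name": "App B",
        "segments": [{"type": "Email", "linked": True}],
        "purposes": {"app functionality"},
    },
    {
        "name": "App C",
        "segments": [
            {"type": "Search history", "linked": True},
            {"type": "Location", "linked": True},
        ],
        "purposes": {"app functionality", "analytics", "third-party advertising"},
    },
    {
        "name": "App D",
        "segments": [{"type": "Diagnostics", "linked": False}],
        "purposes": {"analytics"},
    },
]


def linked_segment_count(app):
    """Count only the segments that can be linked to a user's identity."""
    return sum(1 for segment in app["segments"] if segment["linked"])


def rank_by_segments(apps):
    """Return (name, linked segment count) pairs, most data-hungry first."""
    pairs = [(app["name"], linked_segment_count(app)) for app in apps]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def purpose_shares(apps):
    """Return the percentage of apps that list each collection purpose."""
    counts = Counter(purpose for app in apps for purpose in app["purposes"])
    return {purpose: round(100 * n / len(apps)) for purpose, n in counts.items()}


if __name__ == "__main__":
    for name, count in rank_by_segments(APPS):
        print(f"{name}: {count} linked segments")
    for purpose, share in sorted(purpose_shares(APPS).items()):
        print(f"{purpose}: {share}% of apps")
```

Students could replace the invented records with labels they read off the App Store pages of apps they actually use, then discuss whether counting segments is a fair measure of how "data-hungry" an app is.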

It’s worth noting, of course, that Atlas VPN has its own agenda for the research: Virtual Private Networks are used to circumvent data gathering for all kinds of purposes, and Atlas VPN uses its research to support the sales of its own products. Nevertheless, there’s still a huge quantity of personal and identifying data being gathered by these apps.


Where does all the data go?

To understand why these companies collect so much data, we can return to machine learning and artificial intelligence. Companies gather large amounts of information for various reasons, including for less-than-wholesome purposes. AI plays a significant role in many of these processes. Here are some reasons for extensive data collection that may raise ethical concerns, along with the role of AI in each:

  1. Targeted Advertising: One of the primary reasons companies collect user information is to deliver targeted ads that cater to individual interests, increasing the likelihood of engagement and conversion. AI algorithms analyse the collected data to predict the most effective ads for each user, maximising the return on investment for advertisers. However, this practice can lead to intrusive and privacy-invading ad experiences, as well as the potential for manipulation and exploitation of user data.
  2. Competitive Research and Development: Data collection provides valuable insights for companies to develop new products, features, and services that cater to user needs and preferences. AI can analyse the data to identify gaps in the market or predict future trends, helping companies stay competitive and innovative. However, this practice can lead to aggressive competition, IP theft, or unethical business practices aimed at undermining competitors.
  3. Surveillance and Profiling: Companies may use collected data to create detailed user profiles for various purposes, including targeted advertising, risk assessment, or even political manipulation. AI can process and analyse vast amounts of data, identifying patterns and trends that allow for the creation of comprehensive user profiles. This practice raises significant privacy concerns and can lead to discrimination, manipulation, and other unethical uses of personal information.
  4. Third-Party Data Sharing: In some cases, companies may share collected user data with third parties, such as data brokers, advertisers, or other business partners, without users’ explicit consent. AI can help to identify valuable data points or user segments for monetisation, which might result in increased sharing of personal information. This practice raises concerns about data privacy and the potential misuse of personal data by third parties.
  5. Unfair Competitive Advantages: Companies with access to vast amounts of user data may leverage AI to gain unfair competitive advantages, such as predicting and influencing user behaviour, dominating market segments, or exploiting network effects to create monopolies. This can stifle innovation, reduce consumer choices, and lead to market imbalances.

Teaching AI Ethics

Each of these posts will expand on the original and offer a few suggestions of how and where AI ethics could be incorporated into your curriculum. Every suggestion comes with a resource or further reading, which may be an article, blog post, video, or academic article.

  • History: How does datafication impact the way we study and interpret historical events? How can we ensure the validity of data-driven historical research and avoid misrepresentations of the past?
  • English: How does datafication influence the way literature is analysed and interpreted? What are the ethical implications of using data-driven methods to study literary works and promote academic integrity?
  • Mathematics: How has datafication transformed the study of mathematics and careers in maths? How can we ensure ethical data collection and analysis in mathematical research?
  • Environmental Science: How does datafication impact the study of environmental systems and the interactions between humans and the environment? What are the ethical considerations in using data-driven methods to analyse environmental issues and promote sustainable solutions?
  • Visual Arts: How does datafication influence the creation, interpretation, and distribution of visual art? What are the ethical implications of using data-driven methods in artistic practice, and how can we promote integrity in digital art creation and distribution?
  • Geography: How does datafication impact the study of geographical patterns and processes? What are the ethical considerations in using data-driven methods to analyse and interpret geographical information?

The next post in this series will start the ‘advanced’ series, exploring the impact of facial and affect recognition.
