Issues Surrounding Generative AI Use

"I believe that when both educators and learners improve their critical AI literacy, they will be better able to resist using AI in ways that are harmful or inappropriate. Instead, they will feel empowered to use it constructively, while being aware of the limitations."

- Maha Bali, 2024 (Where are the crescents in AI? | LSE Higher Education)

As generative AI has become increasingly prevalent, new questions and uncertainties are surfacing regarding its use. As educators, it's important to stay informed about the challenges, opportunities, and ethical concerns that students might encounter. The content below highlights some of the major issues related to generative AI use, though it is not an exhaustive treatment. Our goal is to lay the groundwork and offer additional resources for those who would like to learn more. We've also included reflection questions that you can use to begin important conversations with your students about these topics.


Privacy & Data Collection

The Issues

Every time we (and our students) use an AI tool, the companies that developed those tools collect information about us. For example, OpenAI's Privacy Policy states that they collect data that includes account information, IP addresses, device information, browser information, the type of content that users engage with, features used, actions taken, and dates and times of access. OpenAI further states that they may use that personal information for activities such as conducting research, developing new programs and services, and carrying out business transfers. Additionally, they state that they may provide personal information to third parties, including vendors, service providers, government authorities, and industry peers. Consequently, with each use of ChatGPT, users are providing OpenAI with data that can be used to improve its AI models and enhance its business.

Students might not realize, or might not be comfortable with the fact, that AI tools are tracking, collecting, and sharing their personal data. For this reason, instructors should be cautious about requiring students to create accounts or use personal devices for assignments that require AI use. Additionally, OpenAI's Privacy Policy states that anyone under the age of 18 must have permission from a parent or guardian to use their services (and that their "service is not directed to children under the age of 13").

Dr. Torrey Trust highlights privacy and data collection policies for ChatGPT, Google Gemini, and Microsoft Copilot in her comprehensive resource about generative AI and ethics (AI & Ethics Presentation by Dr. Torrey Trust - Google Slides), where she lists the following key points:

  • ChatGPT requires parental permission for 13–18-year-old users; Gemini and Copilot do not.
  • ChatGPT and Gemini can give away any data collected to “affiliates,” including, if requested, federal authorities.
  • Microsoft and Google have more data privacy protections for users.
  • Google tracks user location, OpenAI collects IP addresses, and Microsoft Copilot doesn’t appear to collect any location data.
  • Don’t let students put any sensitive or identifying information into any of these tools!
  • Don’t put any sensitive information into these tools yourself (e.g., asking ChatGPT to write an email to a student about their grade is a FERPA violation).
  • Any information input into these tools (e.g., any prompts users write) is data that can be used by the companies that made the tools.

Learn More

Questions for Students to Consider

  • What types of personal data are you comfortable sharing, and with whom are you willing to share it?
  • What factors or motivators would influence your decisions to share more or less personal data?

Use the following digital drag-and-drop exercise with your students to encourage exploration and discussion of the questions above: Data, Privacy, and Identity Drag and Drop Cards – Technoethics DigCiz

Intellectual Property and Copyright 

The Issues

There are two main issues related to intellectual property and copyright: one concerns the AI tools' training data (i.e., what goes into the AI tool); the other concerns content produced using AI (i.e., what comes out of the AI tool).

Generative AI has been trained using text and images from the internet, in many cases without the consent of the original content creators. The AI-generated output is then based upon, and can closely resemble, the work of artists and writers; however, the sources from which the output is derived are not cited or acknowledged, so those who created the original works are not given credit for their intellectual property.

Ethical alternatives for AI training data do exist. Organizations like Fairly Trained offer certifications for AI models that ensure the AI is trained only on properly licensed data.

Additionally, those who generate content using AI are unable to copyright that content. Under U.S. law, something must be created by a human to receive copyright protection. In other words, if someone develops an image, text, or video using AI tools, they will not be able to claim ownership of that work. Instead, it might be considered either a public domain work or a derivative of the AI tool's training dataset (Lucchi, 2023). Overall, though, it's important to recognize that the issues surrounding AI and intellectual property are nuanced and rapidly developing.

Learn More

Questions for Students to Consider

  • If incorporating AI-generated content into your academic submissions, how could you distinguish between your original ideas and those produced with the help of AI, and would this matter? 
  • If incorporating AI-generated content into your academic submissions and the AI tool you’re using draws upon copyrighted material, how would you navigate the ethical considerations of presenting this content as part of your own work? What are some measures you could implement to ensure transparency and proper acknowledgment of the intellectual contributions involved?

References

Lucchi, N. (2023). ChatGPT: A case study on copyright challenges for generative artificial intelligence systems. European Journal of Risk Regulation, 1–23. https://doi.org/10.1017/err.2023.59

Exploitive Labor Practices

The Issues

When an LLM or other generative AI model is being developed, it usually goes through repeated training cycles to refine its outputs. First, the model is trained on a dataset of some kind, such as a library of text or images. Then it is refined through thousands of hours of human interaction: human data workers are hired to interact with the AI, entering a huge variety of prompts and questions and giving feedback when responses are good or bad. These interactions include prompting the AI to produce harmful output in order to train guardrails into the system. These data workers often live in developing countries with few labor protections and extremely low wages.

As reported by Time Magazine and the Wall Street Journal, when OpenAI was developing ChatGPT's language model, it relied heavily on data workers from Kenya. These workers were paid less than $2 an hour to spend up to 9 hours at a time reading and tagging text inputs and outputs that contained negative or harmful content, including racist and sexist content, sexually explicit content, descriptions of violence and sexual assault, and other objectionable material. Many of those Kenyan workers reported being traumatized by this work and received little or no mental health support from their employer. Data worker exploitation isn't limited to AI development; it's a widespread issue in content moderation across nearly all social media and content platforms online. (Content warning: the linked article contains written descriptions of violent content.)

There are alternative training methods being explored that don't rely on exploitive labor practices. For example, Anthropic's Claude LLM is trained using a "constitution" of ethical principles and priorities, reducing the reliance on underpaid human data workers to label harmful content.

Learn More

Questions for Students to Consider

  • Is it ethically acceptable to use an AI tool that was developed using problematic labor practices?
  • What responsibility should big tech companies have to disclose practices deemed exploitive, and what level of regulation, if any, would be required to ensure ethical practices?
  • In what ways might this issue parallel the history of other technologies or research? Are there other tools or knowledge we rely on that were developed using exploited labor or research subjects? What should be done about it?

Bias

The Issues

Large language models inherently contain bias, as they are trained on internet webpages authored by humans, who are themselves susceptible to bias. Furthermore, these models reflect biases stemming from the disproportionate availability and prevalence of certain types of content over others, tending toward less representation of minority groups or marginalized voices. Bender et al. (2021) point out that "training data has been shown to have problematic characteristics resulting in models that encode stereotypical and derogatory associations along gender, race, ethnicity, and disability status."

OpenAI, for example, acknowledges that ChatGPT is biased and offers the following elaboration: "The model is skewed towards Western views and performs best in English. Some steps to prevent harmful content have only been tested in English. The model's dialogue nature can reinforce a user's biases over the course of interaction. For example, the model may agree with a user's strong opinion on a political issue, reinforcing their belief. These biases can harm students if not considered when using the model for student feedback. For instance, it may unfairly judge students learning English as a second language."

As educators, it's important to talk with students about these built-in biases in the context of their own learning and development as well as the societal perpetuation of bias. According to Scheuer-Larsen (2023), "Informing and teaching about the potential biases of technologies, the need for critical thinking, validation of information, and healthy skepticism, might be a good place to start!"

Learn More

Questions for Students to Consider

  • How can you develop awareness of the biases present in language models like ChatGPT? What strategies can you use to critically evaluate the information provided by such models?
  • In what ways might biases in language models impact your learning experience or influence your perspectives? How can students and educators address these ethical implications in the classroom?

References

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922

Accuracy

The Issues

Generative AI models have the ability to create new content, such as text, images, and audio, based on patterns learned from existing data. Because these models generate content based on patterns and predictions, they are not capable of evaluating meaning or accuracy (Hannigan et al., 2024). Consequently, their output isn't always reliable, even if it sounds convincing. The term "hallucination" refers to false or misleading information produced by generative AI models (e.g., Hannigan et al., 2024; Maleki et al., 2024). AI hallucinations can result from several factors, such as the quality and recency of the training data as well as the complexity and training of the model (Hannigan et al., 2024). If the training data is biased or incomplete, the generated content could reflect that (Bender et al., 2021). It is therefore critical to carefully fact-check generative AI output. In its Terms of Use, OpenAI cautions that "Output may not always be accurate. You should not rely on Output from our Services as a sole source of truth or factual information, or as a substitute for professional advice." Google's Gemini and Anthropic's Claude also include statements that caution users about inaccurate output in their FAQ and Terms of Service documentation, respectively. Understanding the limitations and context-specific accuracy of generative AI output is essential for responsible use.

A second issue related to the accuracy of AI tools surrounds the use of AI detection tools to identify plagiarism, such as those sometimes used in academic settings. AI detection tools are trained on both human- and AI-generated text and are designed to identify and categorize new text as either AI-generated or human-written. Elkhatat et al. (2023) investigated the ability of several AI detection tools to differentiate between human- and AI-generated content and conclude that "While...AI-detection tools can distinguish between human and AI-generated content to a certain extent, their performance is inconsistent and varies depending on the sophistication of the AI model used to generate the content. This inconsistency raises concerns about the reliability of these tools, especially in high-stakes contexts such as academic integrity investigations" (pp. 12–13). Overall, differentiating between AI-generated and human-written content cannot be done with full accuracy or confidence using AI detection tools.

Learn More

Questions for Students to Consider

  • How can you verify that the information produced by generative AI tools is accurate before using it in your academic work?
  • What ethical responsibilities do users have when sharing or relying on AI-generated content?
  • How might your ability to critically evaluate AI-generated output affect your preparedness for the future job market? 

References

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the dangers of stochastic parrots: Can language models be too big? FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922

Elkhatat, A. M., Elsaid, K., & Almeer, S. (2023). Evaluating the efficacy of AI content detection tools in differentiating between human and AI-generated text. International Journal for Educational Integrity, 19, 17. https://doi.org/10.1007/s40979-023-00140-5

Hannigan, T., McCarthy, I. P., & Spicer, A. (2024). Beware of botshit: How to manage the epistemic risks of generative chatbots. Business Horizons. http://dx.doi.org/10.2139/ssrn.4678265

Maleki, N., Padmanabhan, B., & Dutta, K. (2024). AI hallucinations: A misnomer worth clarifying. arXiv preprint arXiv:2401.06796. https://arxiv.org/html/2401.06796v1

Environmental Impacts

The Issues

Training and using AI requires a significant amount of energy, which leads to substantial carbon dioxide emissions. Strubell et al. (2019) estimate that training one natural language processing (NLP) model emits approximately 626,155 lbs of CO2 equivalent, roughly the lifetime impact of five cars. As model sizes increase, the energy required for training also increases, and many models are continuously fine-tuned or re-trained several times. Additionally, this estimate only covers emissions produced during training, not the model's full life cycle. A lack of transparency from some AI companies makes it difficult to understand the carbon footprint of their models or to compare across different AI models.
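The arithmetic behind the five-car comparison can be sketched as follows. This is a back-of-the-envelope check, assuming the per-car figure of roughly 126,000 lbs CO2e over an average car's lifetime (fuel included), which is the comparison baseline used by Strubell et al.:

```python
# Rough check of the Strubell et al. (2019) comparison.
# TRAINING_EMISSIONS_LBS is their estimate for training one large NLP model;
# CAR_LIFETIME_LBS is the assumed average lifetime emissions of one car
# (including fuel) from the same paper.

TRAINING_EMISSIONS_LBS = 626_155  # one NLP model, training only
CAR_LIFETIME_LBS = 126_000        # avg. car, lifetime incl. fuel

cars_equivalent = TRAINING_EMISSIONS_LBS / CAR_LIFETIME_LBS
print(f"Training one model is roughly {cars_equivalent:.1f} car lifetimes of CO2e")
```

Dividing the two figures gives just under 5, which is where the "five cars" shorthand comes from.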

To fully wrestle with the environmental impacts of AI, though, we must consider more than the energy consumption and CO2 emissions of model training and use. A bigger picture also involves the efficiencies and innovations that arise from AI use: some innovations lead to activities that indirectly increase greenhouse gas emissions, while others help reduce emissions and solve environmental problems. The articles below provide further discussion of these issues.

Learn More

Questions for Students to Consider

  • Do you think the substantial amount of energy required to develop and use AI is balanced or can be balanced with the innovative potential of AI to solve or alleviate environmental challenges?
  • What are some approaches that can be taken to align AI's energy requirements with its potential to contribute to environmental problem-solving?

References

Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 3645–3650). Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1355

Tomlinson, B., Black, R. W., Patterson, D. J., & Torrance, A. W. (2024). The carbon emissions of writing and illustrating are lower for AI than for humans. Scientific Reports, 14, 3732. https://doi.org/10.1038/s41598-024-54271-x

Data Manipulation & Disinformation

The Issues

One doesn't have to look far to see the clear implications of generative AI for the landscape of communication. The spread of disinformation, or the deliberate sharing of misleading information, isn't anything new, but the ease with which generative AI can create content without context opens new opportunities for manipulating a message. For example, back in 2017, researchers were able to use AI technology to generate video from audio recordings of then-U.S. President Barack Obama. The implications were staggering at the time, with many concerned about how to discern what was real from what was fake, and how bad actors might make public figures appear to say or do just about anything. Today we see generative AI muddying the waters in everything from global conflicts to the current election cycle. The implication for us and our students is that we must revisit how we critically assess information. Deepfakes and disinformation will only become harder to spot, so students in all fields will need to understand polarizing topics and the motivations behind a source before they can engage with it appropriately. The materials collated by The News Literacy Project, which include a list of examples of AI-generated disinformation from recent news cycles, are a great way to set the tone for this issue.

Learn More

Questions for Students to Consider

  • Where do you think AI-generated disinformation is most likely to occur? What makes those areas stand out?
  • What do you think is a major motivating factor for individuals to use these tools for disinformation?

Context

The Issues

When we talk about AI and context, what we are really discussing is when it is appropriate to use AI and when it isn't. The audience for your generated content may care deeply about where it comes from and whether it is human- or AI-generated. The connection to business and marketing is clear: with the potential to generate ads quickly and at low cost, adoption has been climbing steadily. Google has already seamlessly integrated AI tools into how advertisements are shown in its search engine, and advertising companies are incorporating AI-generated images and videos into their own materials.

If the music you listen to, the videos you watch, or the photos you see are AI-generated, does it change how you interact with the information they convey? In these instances, context matters. The research publication Nature has recently restricted the use of AI because it doesn't satisfy their authorship requirements: there must be accountability for a published work by a human author, which an AI-written resource cannot provide. As an additional example, written communications coming from AI instead of an individual can undermine messaging meant to be heartfelt or personal. At Vanderbilt University, ChatGPT was used to write an email regarding a recent mass shooting, which set off a wave of criticism regarding appropriate use. As the tools get better at creating realistic content with greater ease, the lines of appropriate context will get blurrier. It is important to be able to assess appropriate use now, before it becomes too difficult to distinguish AI-generated content and make calculated ethical decisions about its use later on.

Learn More

Questions for Students to Consider

  • What contexts seem appropriate for AI to be used? Which ones feel out of touch or manipulative?
  • Does it change your view of a work, product, or service, to know it was generated by AI and not a human being? Why or why not?

Click Next for a discussion on Opportunities and Challenges with Generative AI