2.1 What is Generative AI?
Generative AI (genAI) refers to a family of complex artificial intelligence systems that produce new content. Several types of models form the foundation of such systems, the most widely used being large language models (LLMs), diffusion models, and generative adversarial networks (GANs) (Wilson, 2024). While early versions of these tools have been with us for some time in the form of chatbots and rudimentary AI assistants like Siri and Alexa, those emerging since November 2022 eclipse previous tools in their range of capabilities, knowledge bases, and ability to respond to human direction. Moreover, models like ChatGPT, Gemini, and Llama 3 are being integrated with tools people use daily (such as Microsoft Office, Apple products, Google’s office suite, and social media platforms like Instagram and Facebook), making them ubiquitous. Despite their current flaws, they are also beginning to form the foundation for other AI tools used daily in technical environments. It is therefore essential for us to become familiar with the strengths and weaknesses of such tools and with how we can integrate them usefully into our daily workflows.
Before discussing genAI, we should consider it in the context of the long history of AI.
What is Artificial Intelligence (AI)?
Artificial intelligence (AI) is the theory and development of computer systems capable of performing tasks that normally require human intelligence, such as interpreting language, recognizing patterns in large amounts of data, and making decisions. Subfields and techniques of AI, named for how they are designed and what they can do, include neural networks, natural language processing, computer vision, speech recognition, machine learning, and deep learning.
Take a look at the timeline in Figure 2.1.1 to explore a selection of the key advances in AI.
While we tend to think of artificial intelligence as a product of the 21st century, it has been in practical use since the middle of the 20th century. We interact with AI every day. Examples of AI include:
- Unlocking your smartphone with facial recognition.
- Navigating to your destination using apps like Google Maps or Waze to find the quickest route.
- Getting more posts in your social media feeds that match those with which you previously interacted (that you liked or commented on).
- Receiving a notification from your bank that there has been unusual activity in your account.
- Obtaining a recommendation from an online store (or music or video streaming platform) based on your previous purchases.
- Interacting with a customer service chatbot.
- Running your text through grammar-checking software that suggests better ways to write it.
- Using Google Translate to translate text from one language into another.
- Using a voice-to-text app on a smartphone.
- Using a personal assistant like Siri, Alexa, or Cortana.
What is Generative AI (GenAI)?
While we interact with AI every day, it usually hides in the background of the technologies we use. GenAI is different in that we can use our voice and text to direct the technology to produce content (also known as output) that we would normally have to create ourselves.
Miao and Holmes (2023) define genAI as follows:
Artificial Intelligence (AI) technology that automatically generates content in response to prompts written in natural language conversational interfaces. Rather than simply curating existing webpages, by drawing on existing content, genAI actually produces new content. The content can appear in formats that comprise all symbolic representations of human thinking: texts written in natural language, images (including photographs to digital paintings and cartoons), videos, music and software code. GenAI is trained using data collected from webpages, social media conversations and other online media. It generates its content by statistically analyzing the distributions of words, pixels or other elements in the data that it has ingested and identifying and repeating common patterns (for example, which words typically follow which other words).
Figure 2.1.2 can help us understand genAI when we visualize it within the context of developments in AI.
While OpenAI’s ChatGPT was the first large language model to be popularized, other notable text- and code-generating models, known as frontier models, have been released, namely Google’s Gemini (formerly Bard), Anthropic’s Claude, and Meta’s Llama, along with open source models like Mistral/Mixtral.
Frontier Models | Applications |
OpenAI’s ChatGPT | Multimodal (text, image, voice, data); integrated into Microsoft products |
Anthropic’s Claude | Multimodal |
Meta’s Llama | Multipurpose; integrated into Instagram and Facebook |
Google’s Gemini | Multimodal; integrated into Google products |
Mistral/Mixtral Open Source | Language processing only |
At the time of writing, GPT-4 is considered the gold standard, with both Gemini (select versions) and Claude achieving GPT-4-level performance in 2024 (Mollick, 2024). In addition, image-generating models such as DALL-E, Stable Diffusion, Stable Cascade, and Midjourney now offer text-based command modes for creating custom images. OpenAI has also released Sora, which can create realistic videos from text commands. Each model and application displays differing strengths and capabilities, as noted in end-user testing (see the work of Ethan Mollick in particular). The field is developing rapidly: existing models and applications are updated every few months, and new ones continue to be released.
OpenAI’s ChatGPT draws on a genAI model that was trained on a large number of datasets and documents available on the internet to identify and predict how language works (how to create sentences). The GPT in ChatGPT stands for “Generative Pre-trained Transformer”, referring to a type of machine learning involving a neural network in which a computer learns to perform a task by analyzing training examples. For example, the training data used by ChatGPT 3.5 is a dataset that includes 570GB of data from sources like books, Wikipedia, articles, and other pieces of writing on the internet up to 2021 (Gupta, 2023). The GPT model learned the probability that one word (or a part of a word, also known as a token) should follow another, based on context or conversation topic. This ability to predict sentence patterns enables it to generate outputs that seem surprisingly “human”: conversational and coherent.
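The idea of learning how likely one word is to follow another can be illustrated with a toy word-pair counter. This is only a minimal sketch under strong simplifying assumptions: actual GPT models use a transformer neural network over subword tokens and take much longer context into account, not simple counts of adjacent words. The corpus and the function names `train_bigram` and `next_word_probabilities` are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def next_word_probabilities(counts, word):
    """Convert the follow-counts for `word` into probabilities."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a follower
    return {w: n / total for w, n in followers.items()}


# Tiny made-up corpus, for illustration only.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
probs = next_word_probabilities(model, "the")
print(probs)  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```

In this toy corpus, "cat" follows "the" in two of four cases, so the sketch assigns it a probability of 0.5. A large language model makes the same kind of estimate, but over a vast vocabulary and with far richer context.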
Users engage with genAI through written or spoken conversation; each input the user provides is referred to as a prompt. The GPT acquired its conversational ability through a process of “Reinforcement Learning with Human Feedback (RLHF) – a method that uses human demonstrations and preference comparisons to guide the model toward desired behaviour” (OpenAI). Essentially, RLHF involves using human feedback to train the model to give outputs that are socially acceptable, imperfect though they may sometimes be.
Fundamentally, tools like Copilot, ChatGPT, and Gemini are designed to use your prompts to generate a response based on context and conversational topic. You ask questions (prompts), upload reference images or documents, and ask follow-up questions to refine the genAI’s responses until you obtain output or content that meets your needs.
Interestingly, giving the same prompt to the same genAI tool does not always produce the same result. Each time a tool like Copilot or ChatGPT is prompted, it generates a new output. While outputs from the same prompt on the same topic will probably be similar, they will rarely be identical. This is even more apparent when you use the same prompt across multiple AI products.
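One common reason for this variability is that text generators typically sample the next word from a probability distribution rather than always choosing the single most likely word. The following is a minimal sketch of that idea, not a description of how any particular product is implemented. The probabilities are made up, and the helpers `sample_next_word` and `most_likely_word` are hypothetical names used only for illustration.

```python
import random


def sample_next_word(probabilities, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probabilities)
    weights = [probabilities[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def most_likely_word(probabilities):
    """Always pick the highest-probability word (no randomness)."""
    return max(probabilities, key=probabilities.get)


# Hypothetical probabilities for the word following "The weather is".
next_word_probs = {"sunny": 0.5, "cloudy": 0.3, "unpredictable": 0.2}

rng = random.Random()  # unseeded, so results can differ between runs
samples = [sample_next_word(next_word_probs, rng) for _ in range(5)]
print(samples)  # varies from run to run

print(most_likely_word(next_word_probs))  # always "sunny"
```

Running the sampling lines several times will usually give different lists, which mirrors how the same prompt can yield different outputs, whereas always taking the most likely word would give the same result every time.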
GenAI is evolving as we write, and capabilities increase with each new version of a model that is released. The basic text-based capabilities listed later in this section form the foundation for the more complex multimodal capabilities now being released. In ChatGPT, Gemini, and Claude, these multimodal capabilities include the following:
- Verbal capabilities: Interact with the user verbally using natural language, cadence, tone, and pace.
- Visual capabilities: Make use of the user’s camera to engage with dynamic visual elements in the world (not just photos and documents).
- Emotion and facial expression detection: Interpret facial expressions and identify emotions.
- Memory: Remember your conversations unless you have adjusted the settings to turn this off.
Basic Capabilities
While each text-based genAI tool has specific functionalities, some common capabilities include:
- Create informative, well-written text: prose, poetry, dialogue, code, business reports, correspondence
- Provide examples and references. Note: references may be ‘hallucinated’, meaning the model offers fake or inaccurate information while sounding knowledgeable.
- Generate outlines, questions, tables, long form text
- Render images
- Analyze large data sets
- Summarize inputted text
- Provide feedback on text – both form and structure
- Explain concepts at different levels of understanding
- Translate between languages
- Remember information both within a thread and across threads
Seneca Libraries offers an overview of genAI that covers a description, uses, basic prompting, and citation practices: Generative Artificial Intelligence
Also take a look at this video, Introduction to AI for Teachers and Students, by Ethan Mollick and Lilach Mollick of the Wharton School at the University of Pennsylvania. These pioneers in the use of genAI make the topic easy to understand.
Attributions
This page has been edited and remixed from the following sources. Supplemental information has been provided by Robin L. Potter.
Center for Faculty Development and Teaching Innovation. (2023). What is Generative AI? Centennial College. What is GenAI? – Generative Artificial Intelligence in Teaching and Learning (pressbooks.pub). CC BY 4.0.
Coley, M., Snay, P., Bandy, J., Bradley, J., Molvig, O. (2023). Teaching in the Age of AI. Vanderbilt University Center for Teaching. https://cft.vanderbilt.edu/guides-sub-pages
Elkhoury, E. and Prud’homme-Généreux, A. Future Facing Assessments. Licensed under CC BY 4.0.
Hu, K. (2023). ChatGPT sets record for fastest-growing user base – analyst note | Reuters
Paul R. MacPherson Institute for Leadership, Innovation and Excellence in Teaching. (2023). Generative Artificial Intelligence in Teaching and Learning at McMaster University. McMaster University. CC BY 4.0.
Miao, F. and Holmes, W. (2023). 5.2 A ‘human-centred and pedagogically appropriate interaction’ approach. Guidance for generative AI in education and research. UNESCO. unesdoc.unesco.org/ark:/48223/pf0000386693/PDF/386693eng.pdf.multi CC by 3.0.
OpenAI. (2024). How does ChatGPT work? What is ChatGPT? | OpenAI Help Center
The University of Queensland Brain Institute. (n.d.). The history of artificial intelligence. Image. The University of Queensland. History of Artificial Intelligence – Queensland Brain Institute – University of Queensland (uq.edu.au)
References
Appel, G., Neelbauer, J., and Schweidel, D. A. (2023). Generative AI Has an Intellectual Property Problem (hbr.org)
Center for Faculty Development and Teaching Innovation. (2023). Generative Artificial Intelligence in Teaching and Learning – Simple Book Publishing (pressbooks.pub). Centennial College. CC BY 4.0.
Future of Life Institute. (2023). Pause Giant AI Experiments: An Open Letter – Future of Life Institute
Gupta, A. (2023, April). What is ChatGPT and how was it trained? – Paperpal Blog | Paperpal
Mollick, E. and Mollick, L. (2023). Practical AI for instructors and students Part 1: Introduction to AI for teachers and students. https://youtu.be/t9gmyvf7JYo?si=wWzwn8DsSd1FUxWN
Mollick, E. (2024). Which AI should I use? Superpowers and the State of Play (oneusefulthing.org)
Moran, C. (2023). ChatGPT is making up fake Guardian articles. Here’s how we’re responding | Chris Moran | The Guardian
OpenAI. (2023, March 14). GPT-4 (openai.com)
OpenAI. Privacy policy (openai.com)
Perrigo, B. (2023). OpenAI Used Kenyan Workers on Less Than $2 Per Hour: Exclusive | TIME
Piper, K. (2023). A guide to why advanced AI could destroy the world – Vox
Sampson, P. (2023). On Advancing Global AI Governance – Centre for International Governance Innovation (cigionline.org)
Seneca Libraries. (2023). Home – Generative Artificial Intelligence – LibGuides at Seneca Libraries (senecapolytechnic.ca)
Shine, I. and Whiting, K. (2023, May 4). The jobs most likely to be lost and created because of AI | World Economic Forum (weforum.org)
Stokel-Walker, C. (2023). The Generative AI Race Has a Dirty Secret | WIRED UK
Wilson, S. A. (2024). Facebook comment. Higher Ed Discussions of AI Writing & Use. August 28, 2024.