Gray Literature: Beyond Peer Review

Using Generative AI in Research

Generative AI generates new material based on content it has been pre-trained on. Image text duplicated below.

Artificial intelligence is a field of computer science that covers the theory and methods to build machines that think and act like humans. Machine learning is a type of artificial intelligence, which allows computers to learn from data without human programming. Deep learning is a type of machine learning that mimics the human brain using artificial neural networks such as transformers to allow computers to perform complex tasks. Generative AI like ChatGPT, Midjourney, and Bard generates new text, audio, images, video, or code based on content it has been pre-trained on.

Questions about generative AI's role in education, art, and society are ongoing. The information in this document is intended to highlight some common misconceptions about generative AI while promoting your capacity for using AI tools in a helpful, ethical, and safe fashion. Knowing when not to use a tool is just as important as knowing how it works!

Ethics

Is it ethical to have ChatGPT write a course paper from scratch? Probably not. But questions of AI ethics go beyond claiming sole credit for machine-generated content. For one, ChatGPT isn't the only game in town: generative AI tools built on large language models (LLMs) are everywhere in educational and creative technology, including numerous applications designed to increase productivity and support innovation. Top issues include:

  • Many AI tools are trained using text copied from books and the general internet. Many authors and visual artists do not consent to their work being used in this way, claiming invasion of privacy and copyright infringement. Reddit, for example, now charges AI companies to train their models on users' posts.
  • The US Copyright Office has ruled that material created by generative AI cannot receive a copyright, since "no court has recognized copyright in a work originating with a nonhuman".
  • The algorithms that control AI tools often receive additional guidance from human trainers. These trainers often risk exposure to dangerous and hateful material, as well as other forms of worker exploitation. AI trainer efforts to unionize highlight the invisible, human backbone supporting AI tools.
  • Generative AI was also a hot topic in the recent Hollywood strikes by the Writers Guild of America (WGA) and SAG-AFTRA. The WGA's agreement increased protection for writers concerned about studios adopting generative AI for producing scripts or other story elements.
  • Where is your line between helpful AI (useful tools) and dangerous AI (exploitative programs)? Where do your instructors and peers agree? Where do they differ?

When using generative AI tools for academic coursework, it's always safest to consult your instructor or course syllabus for guidance. Since the topic is especially controversial at the moment, use special care when there's a risk of machine-generated work being mistaken for material that was created without the input of AI tools.

Privacy

Many generative AI developers train their models on user input. In other words, the text that you feed into a generative AI model may be stored and shared with others to improve the product. When assessing a new AI tool, it's a good idea to read its privacy policy to see what user data the developers keep and how they intend to use it. Look for phrases like:

  • "Our developers may receive details about your interactions..."
  • "We're constantly improving thanks to your training information."
  • "We reserve the right to train our model on user information."

Sharing privileged information including medical, legal, financial, or personal information with a generative AI tool is a considerable privacy concern.

Challenges

The most practical consideration to keep in mind when using generative AI to create something is that these tools are not actually mimicking human thought. Instead, they estimate which words are statistically most likely to continue your prompt. Analysis, critical thinking, and subject knowledge as they're understood in academia and many other professions do not enter the equation.

For example, if you ask ChatGPT to give you a soup recipe, the output is not determined through a careful consideration of flavor profiles, optimal cooking times, or chemical interactions. Instead, the output is derived from the large language model's training data, which identifies the probabilities of certain text strings existing within the material identified as soup recipes on cooking websites. Please independently verify the edibility of the results before eating ChatGPT soup!
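To make the idea of "statistically likely next words" concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT is built: real large language models use neural networks trained on enormous amounts of text, while this toy only counts which word follows which in a few made-up recipe sentences. The sample text and the function names are assumptions invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training data"; a real model learns from vastly more text.
TRAINING_TEXT = (
    "simmer the soup for ten minutes . "
    "season the soup with salt . "
    "simmer the broth for ten minutes . "
    "serve the soup with bread ."
)


def build_bigram_counts(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def next_word_probabilities(counts, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    followers = counts[word]
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}


counts = build_bigram_counts(TRAINING_TEXT)
probs = next_word_probabilities(counts, "the")
print(probs)  # {'soup': 0.75, 'broth': 0.25}
```

The toy "prefers" soup after "the" only because that pairing appears most often in its sample text. Nothing in the code knows what soup is or whether the result would taste good.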

Generative AI's flexibility is based on having a broad collection of training materials. The provenance of these training materials is often hidden. However, we do know that the training materials for the current version of ChatGPT are all from 2021 or earlier.

Software licenses, especially for free software, will increasingly grant the creators the right to use your data for training AI models.

Fabrications and Hallucinations

One of the most immediate challenges of working with generative AI tools is that they often produce factually incorrect text. Alternately called "fabrications" or "hallucinations", these bits of text often look or sound believable. However, many AI tools lack trustworthy safeguards or fact-checking. In academic research, fabrications frequently appear in citations. Examples of this include ChatGPT "making up" sources that don't exist and linking sources to incorrect authors.

Can I Use AI for This? Should I?

That is a very complicated question. There are many ethical concerns and potential challenges when using generative AI, but also a number of benefits. This flowchart was designed with ChatGPT in mind, but provides a helpful model for working with any generative AI.

Using ChatGPT with Caution

Flowchart for determining when it's safe to use ChatGPT. "Yes" if it doesn't matter if the output is true. "Possibly" if you have the expertise to verify the output and take responsibility for inaccuracies. "No" if you do not have the expertise to verify the output or are unwilling to take responsibility for inaccuracies.
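For readers who think in code, the flowchart's three outcomes can be sketched as a short Python function. The function and parameter names are made up for illustration. The logic only restates the flowchart and is no substitute for your own judgment or your instructor's guidance.

```python
def can_i_use_chatgpt(output_must_be_true, can_verify, will_take_responsibility):
    """Return the flowchart's answer: "Yes", "Possibly", or "No"."""
    # If accuracy does not matter, the flowchart says the tool is safe to use.
    if not output_must_be_true:
        return "Yes"
    # Accuracy matters: you need both the expertise to check the output
    # and the willingness to own any inaccuracies that slip through.
    if can_verify and will_take_responsibility:
        return "Possibly"
    return "No"


# Brainstorming silly band names: truth is irrelevant.
print(can_i_use_chatgpt(False, False, False))  # Yes
# Summarizing a topic you know well and will double-check.
print(can_i_use_chatgpt(True, True, True))     # Possibly
# Generating citations in a field you cannot verify.
print(can_i_use_chatgpt(True, False, True))    # No
```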

Prompt Possibilities

With the above limitations in mind, generative AI may be useful for many tasks. Getting the most out of the tools often requires effective prompt creation. In other words, what are you giving the tool to interpret? An effective prompt is specific in scope and includes any requirements for the expected output.
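If you reuse prompts often, one way to keep them specific is to assemble them from separate parts. The Python sketch below is only an illustration: the function name, the "Scope" and "Requirements" labels, and the example values are assumptions, not a required format for any particular AI tool.

```python
def build_prompt(task, scope, requirements):
    """Combine a task, its scope, and output requirements into one prompt."""
    lines = [
        task.strip(),
        f"Scope: {scope.strip()}",
        "Requirements for the output:",
    ]
    # One bullet per requirement keeps each expectation explicit.
    lines += [f"- {item.strip()}" for item in requirements]
    return "\n".join(lines)


prompt = build_prompt(
    task="Explain the concept of plate tectonics.",
    scope="Write for a fifth-grade reader.",
    requirements=["Use no more than 150 words.", "Include one everyday analogy."],
)
print(prompt)
```

Writing the scope and requirements as separate pieces makes it easier to change one of them and compare how the output differs.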

Just like you shouldn't stop at the first Google search result, it's best to try multiple prompts with AI tools. Respond to the output that you receive, interrogating and testing the results where possible. Your goal is to identify potential errors and guide the AI tool toward providing material that you can use.

Prompt Examples and their Functions

 Purpose  Prompt
 Feedback and self-assessment  "Provide feedback on the following writing sample: [insert sample]. Specify how well the writing sample meets the following criteria [insert criteria/rubric] and give me suggestions for how I can improve my writing."
 Personalized/adaptive learning  "I am trying to improve my knowledge of [subject/task]. Ask me a question about [subject/task/topic] and keep asking me adaptive questions to help me improve."
 Explain/summarize information  "Explain the concept of plate tectonics as if I am in fifth grade."
 Role-playing  "I will play the role of clinician and you will play the role of a patient, a 55-year-old female adult with recent onset of chest pain. Rate my conversation with the patient based on [enter criteria or framework]."
 General research query  "What are five steps environmental agencies can take to establish trust among at-risk populations?"

Further Reading

For further information on generative AI at OSU, check out AI@OSU and Ecampus's guide to AI literacy.

"Using ChatGPT with Caution" and "Prompt Examples and their Functions" are adapted from the AI Literacy Module developed by Rush University's Center for Teaching Excellence and Innovation. They are used under a CC BY-NC-SA License.