Academic Integrity Digest: September 2024

Welcome to the Fall edition of the Academic Integrity Digest, and welcome back to campus for the 2024/25 academic year.

This issue continues to explore the intersection of academic integrity and generative AI. Specifically, the editorial takes a deeper look at tools for detecting the use of artificial intelligence and presents research on their efficacy, suitability, and performance in detecting academic misconduct. In the resources section, you will find updated information to support and guide decisions around the use of generative AI in teaching and learning. UBC will mark Academic Integrity Week October 15-18 with a variety of events and engagement sessions on each campus. See the events section for more details.


Editorial

The flawed promise of generative AI writing detectors

Dr. Simon Bates, Vice-Provost and Associate Vice-President, Teaching and Learning (UBCV)
Heather Berringer, Associate Provost, Academic Operations and Services (UBCO)

Early reactions to the widespread availability of generative AI writing tools such as ChatGPT included concerns over a low-barrier route to misconduct in unsupervised assessments. “The College Essay Is Dead”, cried the headlines in the popular press [1]. Initial institutional and individual responses were mixed: some saw it as a golden opportunity to rethink and retool assessments to focus on process and not just the final product. This, they argued, was essential for a degree to stay relevant in an era when AI would soon be built directly into widely available suites of digital tools. The opposite reaction was to seek technology as impressive as the generative tools themselves, capable of reliably and accurately detecting AI-generated passages of text. Proponents of this approach argued it was essential to remain one step ahead of those looking to pass AI-generated text off as their own.

AI-generation detectors sprang up with bold claims about efficacy and utility, but with no transparency about how they were trained or what datasets were used to evaluate them, and no way to reproduce their results. These tools, and the practices that grew up around them, gave students no visibility into how their writing was being evaluated. These concerns led UBC to not enable Turnitin’s AI-detection feature in April 2023 [2], a decision re-affirmed in August 2023 [3]. A year later, even more detector tools are available; in what follows, we summarize recently published research that illustrates how these tools perform under a systematic battery of tests that includes human-generated, AI-generated, translated, and obfuscated texts.

Weber-Wulff et al. [4] published a comprehensive study of 14 available tools, each given five test cases of pre-written text of between 2,000 and 10,000 characters. More detail follows, but the headline findings were:

  1. All of the AI-generated text detection tools evaluated have serious limitations, contrary to claims made by their developers and vendors.
  2. They too often produce false positives and false negatives, both of which carry serious repercussions for students.
  3. No tool predicts the authorship of a text with better than 80% accuracy; many struggle to reach 70%.
  4. It is easy to fool the detectors with translation, text manipulation, and paraphrasing, further reducing the accuracy of every tool.

One of the research challenges was mapping the various outputs onto a consistent scale that could be used to compare across tools. Some tools output a percentage likelihood of human or AI authorship, while others use statements like “possibly AI generated.” The researchers therefore mapped each output onto a five-point scale that served as the basis for classifying accuracy and error type. For example, text written by an AI tool should produce a ‘positive’ result from an effective detector; the detector's actual performance on that text is classified as one of five categories: True Positive, Partially True Positive, Unclear, Partially False Negative, or False Negative. Tools were then evaluated on their accuracy (by tool and by document type) and on the rate at which they produced false positives (human-written text flagged as AI-written) and false negatives (AI-written text flagged as human-written). The figure below from the paper shows overall accuracy for each tool, calculated as an average across all approaches discussed. It illustrates the third conclusion noted above: no tool performs with higher than 80% accuracy, and most fall well below that.

Widespread availability of generative AI tools presents real challenges for assessment design, but a quick-fix detection tool is not the answer. These tools are not sufficiently accurate, the methods by which they reach their conclusions are opaque, and they are too easily gamed by simple obfuscation techniques. Efforts should instead focus on preventative and educative approaches, including a rethink of assessment strategies and activities, and a renewed focus on evidencing the process of learning rather than simply assessing a final product. This is certainly non-trivial and looks very different across disciplinary contexts, but it is vitally important as we continue to build and maintain a culture of integrity at UBC.


[1] Quite literally, that was the title. “The College Essay Is Dead,” The Atlantic, December 2022. https://www.theatlantic.com/technology/archive/2022/12/chatgpt-ai-writing-college-student-essays/672371/ (retrieved July 30, 2024).

[2] “UBC not enabling Turnitin’s AI detection feature,” April 2023. https://lthub.ubc.ca/2023/04/04/ubc-not-enabling-turnitins-ai-detection/ (retrieved July 30, 2024).

[3] “UBC affirms decision to not enable Turnitin’s AI detection feature,” August 2023. https://lthub.ubc.ca/2023/08/28/ubc-affirms-decision-to-not-enable-turnitin-ai-detection/ (retrieved July 20, 2024).

[4] Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S. et al. Testing of detection tools for AI-generated text. Int J Educ Integr 19, 26 (2023). https://doi.org/10.1007/s40979-023-00146-z


Resources

  • UBC Guidance on Generative AI includes UBC’s principles, teaching and learning guidelines, and academic integrity guidelines.
  • Revised Generative AI FAQ contains information for instructors and students about how to use generative AI tools with integrity.
  • AI in Teaching and Learning supports educators with resources from the Centre for Teaching, Learning, and Technology (CTLT) at UBC Vancouver, and the Centre for Teaching and Learning (CTL) at UBC Okanagan.

Upcoming Events

UBC will celebrate Academic Integrity Week from October 15-18, 2024. Both campuses are offering events that raise awareness and encourage conversation about academic integrity. The week will also celebrate students’ efforts to learn and work with integrity.

More information about the week’s events, and registration links, can be found on our Academic Integrity Week page.

UBC’s Academic Integrity Week coincides with the International Center for Academic Integrity’s (ICAI) International Day of Action for Academic Integrity. This year’s theme is ‘All hands on deck: Making academic integrity everyone’s job’.

More information can be found on the ICAI’s International Day of Action for Academic Integrity page.

Join colleagues from across the province at Kwantlen Polytechnic University for Academic Integrity Day, featuring keynote speaker Dr. Alana Abramson. See the BC Academic Integrity Day page for information and registration, and to download the full schedule for the day.

10 am – 4 pm | In-person conference at Kwantlen Polytechnic University