
A Guide to AI for Gonzaga Faculty

How AI Detectors Work

[Image: A young boy walking through rain, carrying a monkey]

AI detectors are pattern-based AIs, just like the LLMs they are evaluating. They are trained on examples of AI-generated and human-written text, and make their best probabilistic guess about whether a given piece of text is more like one than the other.

There are two main measures by which the writing is evaluated: perplexity and burstiness.

Perplexity is a measure of how surprising the text is: how much each word differs from the word a language model would predict next in the sequence. For example, if a sentence starts, "It's raining out, so take your . . .", the word "umbrella" rates as low perplexity, while the word "monkey" is high perplexity.
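The umbrella/monkey example can be sketched in a few lines of Python. The probabilities below are invented for illustration; a real detector would obtain them from a language model.

```python
import math

# Toy next-word probabilities for the prefix "It's raining out, so take your ..."
# These numbers are made up for illustration only.
next_word_probs = {"umbrella": 0.40, "coat": 0.25, "boots": 0.10, "monkey": 0.0001}

def surprisal(word):
    """Surprisal in bits: low for expected words, high for surprising ones.
    Perplexity is derived from the average surprisal over a whole text."""
    return -math.log2(next_word_probs[word])

print(f"umbrella: {surprisal('umbrella'):.1f} bits")  # low: the expected word
print(f"monkey:   {surprisal('monkey'):.1f} bits")    # high: very surprising
```

A detector averages this kind of score over an entire passage: text that consistently uses the most predictable next word scores low overall.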

Burstiness is a measure of the variability of sentence length and structure. Like this paragraph. The sentence lengths vary; the structure does as well. That's burstiness.
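Burstiness can be approximated by looking at the spread of sentence lengths. This sketch uses the standard deviation of sentence lengths as a rough proxy; real detectors use more sophisticated features.

```python
import re
import statistics

def burstiness(text):
    """Rough burstiness proxy: standard deviation of sentence lengths in words.
    Illustrative only; actual detectors also consider sentence structure."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The storm rolled in fast, flooding streets and fields alike. Why?"

print(burstiness(uniform))  # low: every sentence is the same length
print(burstiness(varied))   # higher: sentence lengths vary widely
```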

In general, AI creates text with lower perplexity and burstiness than humans do. AI detectors will therefore rate text with low perplexity and burstiness as more likely to be AI-created.

Do AI Detectors Work?

The short answer is: not very well.

Now that we know how AI detectors work, several problems become obvious, any of which can lead to both false positives and false negatives:

  • Humans can write with low perplexity and burstiness. In fact, humans are more likely to write with lower perplexity and burstiness when writing in formalized and graded contexts, like academic writing.
  • Because LLMs are trained on examples of human writing in the first place, those examples can closely resemble what an AI outputs. This has led AI detectors to flag portions of the Bible or the U.S. Constitution as AI-generated. The companies that create AI detectors train these specific behaviors out when they are made aware of them, but this just puts a band-aid on that specific example of the problem; it does not fix the underlying issue.
  • Some LLM models allow users to adjust the "temperature" of the output. Higher temperature means the LLM is more likely to choose words from further down on the list of most probabilistically likely words to come next. Higher temperatures therefore result in higher perplexity.
  • Non-native English writers are more likely to write with lower complexity, and therefore lower perplexity, than native English speakers. This can lead to false positives biased against ESL students.
  • AI is getting better at burstiness. The large companies continually train and refine new LLM models, using new corpora and new training methods, and the results tend to become less uniformly "AI-like." AI detectors are in a constant arms race, training their models to adjust to new LLM models.
  • A student who understands how an AI detector works can fairly easily generate text that defeats it, by simply prompting the AI to generate responses in a specified style.
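The temperature point above can be demonstrated directly. The sketch below scales invented next-word logits by 1/temperature and samples repeatedly; at higher temperature, the unlikely word is chosen far more often, which raises the perplexity of the resulting text.

```python
import math
import random

def temperature_sample(logits, temperature, rng):
    """Sample a word after scaling logits by 1/temperature.
    Higher temperature flattens the distribution, so less-likely words
    are chosen more often. Logits here are invented for illustration."""
    words = list(logits)
    scaled = [logits[w] / temperature for w in words]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(words, weights=weights)[0]

# Made-up logits for the next word after "take your ..."
logits = {"umbrella": 4.0, "coat": 3.0, "monkey": 0.0}
rng = random.Random(0)

low = [temperature_sample(logits, 0.5, rng) for _ in range(1000)]
high = [temperature_sample(logits, 2.0, rng) for _ in range(1000)]

print("monkey rate at T=0.5:", low.count("monkey") / 1000)   # rarely chosen
print("monkey rate at T=2.0:", high.count("monkey") / 1000)  # chosen much more often
```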

Learn more:

Why AI writing detectors don’t work - Benj Edwards, Ars Technica

OpenAI confirms that AI writing detectors don’t work - Benj Edwards, Ars Technica

We tested a new ChatGPT-detector for teachers. It flagged an innocent student - Geoffrey A. Fowler, The Washington Post

AI-Detectors Biased Against Non-Native English Writers - Andrew Myers, Stanford Institute for Human-Centered AI

Can I Use AI Detectors?

The short answer is: yes, but with caution. An AI detector can be another piece of data in a larger picture when a student is suspected of using AI, but it should never be the sole piece of data relied upon. An AI detector cannot prove AI use, so treat it as a diagnostic tool, not a decision maker.


Learn more:

Be Your Own Best AI Detector - Justin Marquis, Gonzaga IDD
