GPTCLEANUP AI Blog

Practical guides for tidying up AI text, removing messy spacing, and keeping formatting clean across tools.

AI Detection Explained

How Does an AI Detector Work?

AI detectors are probabilistic classifiers that analyze statistical patterns in text — not semantic meaning. They look for signals like predictability, structural uniformity, and Unicode anomalies to estimate the likelihood that a language model generated the content. Understanding their methodology reveals both their power and their significant limitations.

Perplexity

Measures how predictable each word choice is

Burstiness

Measures variation in sentence length and complexity

Unicode scanning

Detects hidden characters and invisible markers

The Core Concept: Language Models and Probability

To understand how AI detectors work, you first need to understand how AI language models generate text. Models like GPT-4, Claude, and Gemini do not "think" in the way humans do. They predict the next most likely token (roughly, a word or word fragment) given everything that came before it. At each step, they consult a probability distribution over their entire vocabulary and select from the top candidates.

This means AI-generated text has a distinctive statistical signature: it tends to be very predictable. The model consistently chooses high-probability tokens. A human writer, by contrast, makes more idiosyncratic choices — selecting words that are contextually plausible but not necessarily the single most predictable option. Human writing has more statistical entropy.

AI detectors exploit this difference. By measuring how "surprising" or "expected" each word choice is — using the same probability frameworks that underlie language models — they generate a score that indicates how likely it is that a model produced the text.
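
This next-token selection process can be illustrated with a toy distribution. Everything below — the tokens, their probabilities, and the temperature values — is made up for illustration, not taken from any real model:

```python
import random

# Toy next-token distribution a model might assign after "The weather is".
# All tokens and probabilities here are illustrative.
next_token_probs = {
    "nice": 0.40, "sunny": 0.25, "cold": 0.15,
    "unpredictable": 0.10, "marmalade": 0.001,
}

def sample_token(probs, temperature=1.0):
    """Sample a token; lower temperature concentrates mass on top choices."""
    weights = {t: p ** (1.0 / temperature) for t, p in probs.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    cumulative = 0.0
    for token, w in weights.items():
        cumulative += w
        if cumulative >= r:
            return token
    return token  # fallback for floating-point edge cases

# At low temperature the model almost always picks "nice" or "sunny" --
# exactly the high-probability bias that detectors look for.
print(sample_token(next_token_probs, temperature=0.3))
```

Lowering the temperature sharpens the distribution toward the top candidates, which is why deterministic generation settings produce the most statistically detectable text.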

Perplexity: The Primary Detection Signal

Perplexity is a mathematical measure of how well a probability model predicts a sequence. In natural language processing, low perplexity means the language model found the text highly predictable. High perplexity means the text was surprising — the model would not have predicted those specific word choices.

AI-generated text consistently shows lower perplexity than human-written text when scored by a language model. This is the fundamental signal most AI detectors use. The detector runs the text through a reference language model, records the probability the model assigned to each observed token, and aggregates those per-token values into a perplexity score used to classify the text.

How perplexity scoring works in practice

  1. The detector feeds your text token-by-token into a reference language model.
  2. At each position, it asks: how surprised was the model by this token?
  3. It calculates a running perplexity score across the full text.
  4. Low average perplexity = likely AI. High average perplexity = likely human.
  5. The score is mapped to a probability estimate and displayed as a percentage.
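
The five steps above can be sketched in a few lines of Python. A real detector would obtain per-token probabilities from a reference language model; here they are hard-coded toy values:

```python
import math

def perplexity(token_probs):
    """Perplexity from the probability a reference model gave each token.

    In a real detector these values come from a language model; here
    they are supplied directly for illustration.
    """
    # Average negative log-probability (cross-entropy), then exponentiate.
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Predictable text: the model assigned high probability to every token.
ai_like = [0.9, 0.8, 0.85, 0.9, 0.75]
# Surprising text: many tokens the model did not expect.
human_like = [0.3, 0.05, 0.6, 0.1, 0.2]

print(perplexity(ai_like))     # low perplexity -> classified as likely AI
print(perplexity(human_like))  # high perplexity -> classified as likely human
```

The final classification step then maps the perplexity score onto a percentage, typically by comparing it against distributions measured on known human and AI corpora.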

The challenge is that perplexity alone is not perfectly discriminative. Formal writing, technical documentation, legal text, and academic prose tend to have lower perplexity than casual writing — simply because they follow predictable conventions. This causes false positives for human writers who write in structured, formal styles.

Burstiness: The Second Key Signal

Burstiness measures the variability of sentence structure and length within a piece of text. Human writing is characteristically "bursty" — humans naturally mix short punchy sentences with long complex ones, vary their syntax, interrupt themselves, use fragments, and shift rhythm in ways that reflect natural thought patterns.

AI-generated text tends to be much more uniform. Models often produce sentences of similar length, maintain consistent grammatical complexity throughout, and rarely produce the kind of stylistic interruptions or informal asides that characterize human writing. This structural uniformity is detectable statistically.

Human writing burstiness

  • Wide range of sentence lengths (5 to 40+ words)
  • Irregular syntax patterns
  • Fragments and parenthetical asides
  • Rhythm shifts between sections
  • Occasional run-ons or mid-thought pivots

AI writing burstiness

  • Narrow sentence length range (15–25 words typical)
  • Consistent grammatical structure
  • Rarely fragments; rarely run-ons
  • Uniform rhythm throughout
  • Predictable paragraph structure
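
A crude version of the burstiness signal can be computed from sentence-length statistics alone. The sketch below uses only the standard library; the sentence splitting is deliberately naive, and a real detector would also analyze syntax:

```python
import re
import statistics

def burstiness(text):
    """Standard deviation of sentence lengths (in words) --
    a crude proxy for burstiness."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

human = ("Short. Then a much longer, winding sentence that keeps going "
         "with asides. Why? Because people write that way.")
ai = ("The model writes evenly measured sentences. Each one has a similar "
      "length. The rhythm stays uniform throughout the text.")

print(burstiness(human))  # wide spread of sentence lengths
print(burstiness(ai))     # narrow spread
```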

AI detectors combine burstiness and perplexity scores to create a composite classification. Neither signal alone is reliable enough, but together they achieve better accuracy than either in isolation.

Unicode and Hidden Character Scanning

A third, less discussed but increasingly important detection method is Unicode character analysis. When generating text, AI systems sometimes produce non-standard Unicode characters that do not appear in typical human-written content. These include:

Zero-width characters

U+200B (zero-width space), U+200C (zero-width non-joiner), U+200D (zero-width joiner). These are invisible in rendered text but present in the raw string, and because humans almost never type them, their presence is a detectable anomaly.

Non-standard punctuation

AI models often produce Unicode em dashes (U+2014), en dashes (U+2013), curly quotes, and other typographic characters that differ from the ASCII equivalents a human might type on a standard keyboard.

Byte-order marks

Some AI output pipelines insert byte-order marks (U+FEFF) at the beginning of text or between sections. These are invisible and harmless in most rendering contexts but detectable in raw text analysis.

Soft hyphens and formatting characters

Soft hyphens (U+00AD) and other formatting-related control characters sometimes appear in AI output as artifacts of how models handle long words and line-breaking during generation.

Some detector tools scan for these Unicode patterns as a secondary signal. The Invisible Character Detector tool shows you exactly which of these characters are present in any text you paste — useful for understanding whether your AI-generated content carries these artifacts before publication.
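
The scan itself is straightforward. A minimal version covering the characters listed above (real scanners check a much longer list):

```python
# Codepoints named in the sections above.
SUSPECT_CHARS = {
    "\u200b": "zero-width space",
    "\u200c": "zero-width non-joiner",
    "\u200d": "zero-width joiner",
    "\ufeff": "byte-order mark",
    "\u00ad": "soft hyphen",
}

def scan_hidden(text):
    """Return (index, name) for every hidden character found in text."""
    return [(i, SUSPECT_CHARS[ch])
            for i, ch in enumerate(text) if ch in SUSPECT_CHARS]

sample = "clean\u200btext\ufeff"
for index, name in scan_hidden(sample):
    print(f"position {index}: {name}")
```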

Classifier Models: The Machine Learning Layer

More sophisticated AI detectors use classifier models — neural networks trained on large datasets of known human and AI text. These classifiers learn features that distinguish the two beyond simple perplexity and burstiness metrics. They can detect patterns in argument structure, topic transition styles, specific phrase patterns common to models, and subtle vocabulary preferences.

The most advanced detectors combine multiple approaches: watermark detection (for models that implement cryptographic watermarking in their sampling process), perplexity scoring, burstiness analysis, Unicode scanning, and trained classifier models. The outputs of these components are weighted and aggregated into a final confidence score.
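
The aggregation step can be sketched as a weighted average of per-signal probability estimates. The signal names, scores, and weights below are purely illustrative — real detectors learn their weighting from labeled training data:

```python
def combined_score(signals, weights):
    """Weighted average of per-signal AI-probability estimates (0..1).

    Illustrative only: production detectors tune these weights on
    labeled human/AI corpora rather than hand-picking them.
    """
    total = sum(weights.values())
    return sum(signals[name] * w for name, w in weights.items()) / total

signals = {"perplexity": 0.82, "burstiness": 0.70,
           "unicode": 0.10, "classifier": 0.88}
weights = {"perplexity": 0.4, "burstiness": 0.2,
           "unicode": 0.1, "classifier": 0.3}

print(f"AI probability: {combined_score(signals, weights):.0%}")
```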

OpenAI has researched statistical watermarking methods that would embed an imperceptible signal into AI-generated text during the token sampling process — by systematically biasing token selection according to a secret key. This would make AI-generated text detectable only to those who know the key. As of this writing, this approach is not deployed publicly, but it represents the direction the field is moving.
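
OpenAI's exact scheme is unpublished, but the general "green list" idea from the watermarking research literature can be sketched: a secret key deterministically splits the vocabulary for each context, generation slightly favors the "green" half, and the detector checks whether green tokens appear more often than the roughly 50% chance baseline. All names and the hashing choice below are illustrative:

```python
import hashlib

def green_list(prev_token, vocab, secret_key):
    """Deterministically partition the vocabulary using the previous
    token and a secret key; the generator would slightly favor
    'green' tokens when sampling."""
    green = set()
    for token in vocab:
        digest = hashlib.sha256(
            f"{secret_key}:{prev_token}:{token}".encode()).digest()
        if digest[0] % 2 == 0:  # ~half the vocab is green per context
            green.add(token)
    return green

def green_fraction(tokens, vocab, secret_key):
    """Detection side: fraction of tokens that land in their green list.
    Watermarked text should score well above the ~0.5 chance baseline."""
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab, secret_key)
    )
    return hits / max(len(tokens) - 1, 1)
```

Because only the key holder can recompute the green lists, the signal is invisible to everyone else — which is what makes watermarking fundamentally different from statistical detection.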

Why AI Detectors Get It Wrong So Often

AI detectors are probabilistic classifiers with significant error rates. Multiple independent studies have documented false positive rates above 10% for some detectors — meaning more than one in ten human-written texts is incorrectly flagged as AI. Understanding the failure modes is critical before relying on any detector output.

Non-native English speakers

Research published in 2023 and 2024 showed that text written by non-native English speakers is flagged as AI at dramatically higher rates. This is because formal, careful non-native writing has lower perplexity than casual native writing — the same statistical profile as AI text.

Formal academic writing

Academic prose, legal writing, and technical documentation follow very predictable conventions. Their sentence structures, vocabulary, and argument patterns create low-perplexity text that detectors misclassify as AI.

Model updates outpace detectors

Each new AI model generation produces text with different statistical signatures. Detectors trained on GPT-3 output may miss GPT-4 text, and vice versa. The arms race between generation and detection is ongoing.

Paraphrasing and editing

Any editing of AI text — rephrasing sentences, changing word choices, restructuring paragraphs — increases perplexity and burstiness, making the text look more human. Even modest editing significantly reduces detector confidence.

What AI Detectors Cannot Do

It is equally important to understand the hard limits of what current AI detectors cannot do. These boundaries define how much weight you should give to any detector output.

  • Detectors cannot tell you which AI model wrote the text. They can only estimate whether AI was involved, not whether it was GPT-4, Claude, Gemini, or another model.
  • Detectors cannot determine how much of a text is AI. If a document is 30% AI and 70% human, detectors will give a blended score that is difficult to interpret.
  • Detectors cannot verify intent. A human who writes in a very structured, formal style will score similarly to AI. The detector measures patterns, not provenance.
  • Detectors cannot handle short texts reliably. Most detectors require at least 250 words to produce meaningful scores. Very short texts produce unreliable classifications.
  • Detectors cannot guarantee accuracy. No current detector claims 100% accuracy or zero false positives. All outputs should be treated as probabilistic estimates, not definitive verdicts.

How to Interpret Your AI Detector Score

If you use an AI detector and receive a result, here is how to interpret it responsibly. A high AI probability score does not prove AI authorship. A low score does not prove human authorship. Both are probabilistic estimates with known error rates.

High scores (above 80%) on clearly human-written text usually indicate one of a few things: the text is written in a very formal or predictable style, the author is a non-native English speaker writing carefully, or the text covers a highly structured topic that generates consistent output. In these cases, editing to increase variety, adding personal anecdotes, or restructuring sentences can change the score significantly.

For purely AI-generated text, scores vary depending on the model, the prompt, and the temperature setting. Text generated at lower temperatures (more deterministic) scores higher for AI. Text generated at higher temperatures (more random) scores lower. The GPT Cleanup Tools suite can help normalize AI text and remove artifact characters before you check detection scores.

The Future of AI Detection

The AI detection field is evolving rapidly. Three directions are likely to shape the next generation of detection tools. First, cryptographic watermarking embedded at the model level will make AI text definitively identifiable to authorized parties, without relying on statistical inference. Second, ensemble methods combining multiple detection approaches will improve overall accuracy. Third, provenance tracking — attaching verified authorship metadata to documents at creation time — may become a standard complement to detection.

For now, AI detectors are useful tools with significant limitations. Use them as one signal among many, not as a definitive verdict on authorship. When in doubt, focus on what detectors cannot fake: genuine expertise, personal experience, verified facts, and original insight.

Want to check how your text scores?

Use the AI Detector to analyze your content, and the Invisible Character Detector to find any hidden Unicode artifacts that could skew the results. Clean text gives you the most accurate detection picture.