What are perplexity and burstiness in AI detection? Explained simply

In this article

What perplexity measures
What burstiness measures
How they look in real text
Why human essays sometimes score low
How to improve your scores

When you see scores like "Perplexity: 78" and "Burstiness: 85" in SafeGrade's writing analysis, what do they actually mean? And why does AI writing consistently score lower on both?

These two signals are central to many AI detection systems, reportedly including Turnitin's. Understanding them helps you interpret your SafeGrade results, and it also shows you what authentic academic writing looks like from a statistical perspective.

Signal 1

Perplexity

Plain English definition: Perplexity measures how surprising or unpredictable the word choices in a piece of writing are. A high perplexity score means the text frequently uses unexpected words. A low perplexity score means the word choices are highly predictable — each word is statistically "obvious" given what came before it.

Why AI scores low: Large language models work by predicting the most likely next token (word or word-fragment) given the context. Because they're trained to produce fluent, coherent text, they tend to choose statistically common, expected words. This makes AI writing smooth and readable — but also statistically predictable. Detection tools measure this predictability and flag it.

Why humans score high: Human writers make surprising choices — they use unusual vocabulary, unexpected metaphors, specific concrete examples, and personal turns of phrase. A student writing about sociology might use the specific word "habitus" rather than "social disposition." A student writing about law might use "tortfeasor" rather than "person who caused the harm." These choices are unpredictable to a model trained on general text, which raises the perplexity score.
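To make the idea concrete, here is a minimal sketch of the standard textbook calculation: perplexity is the exponential of the average negative log-probability a language model assigns to each token. The probability values below are invented for illustration, and this is not SafeGrade's or Turnitin's scoring code. Their reported scores (such as "Perplexity: 78") sit on their own scale, which this sketch does not reproduce.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(average negative log-probability per token)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical probabilities a scoring model might assign to each token.
predictable = [0.9, 0.8, 0.9, 0.85]  # every word was the "obvious" choice
surprising = [0.9, 0.05, 0.6, 0.02]  # a couple of unexpected words

print(round(perplexity(predictable), 2))  # roughly 1.16
print(round(perplexity(surprising), 2))   # roughly 6.56
```

Two low-probability words are enough to raise the result sharply, which is why a single specialist term such as "habitus" can shift the score for a whole sentence.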

Signal 2

Burstiness

Plain English definition: Burstiness measures how much sentence lengths vary throughout a piece of writing. High burstiness means the text has dramatic variation — short punchy sentences followed by long complex ones. Low burstiness means all sentences are roughly the same length — uniform, consistent, regular.

Why AI scores low: AI models tend to produce text with very consistent sentence structure. If you ask ChatGPT to write an essay, it usually produces sentences of similar length and complexity throughout, because that is the style its training rewarded as "good writing." The result is text that flows well but has an almost metronomic regularity that human writing rarely has.

Why humans score high: Natural human writing is rhythmically irregular. Writers speed up and slow down. They use one-word sentences. Then they use longer, more elaborated sentences that develop a point through multiple clauses, building toward a conclusion that brings the argument together. That variation — the burst pattern — is what burstiness captures.
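One simple way to put a number on this variation is the coefficient of variation of sentence lengths: the standard deviation divided by the mean. The sketch below assumes that definition. Detection tools do not publish their exact formulas, so treat `burstiness` here as an illustrative stand-in rather than SafeGrade's metric. The sentence splitter is deliberately naive and will mis-split abbreviations such as "e.g."

```python
import re
import statistics

def sentence_lengths(text):
    """Word count per sentence, splitting after ., ! or ? followed by whitespace."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence lengths (population stdev / mean)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. The cat sat down on the warm windowsill and watched the rain fall for an hour."

print(burstiness(uniform))           # 0.0, every sentence has four words
print(round(burstiness(varied), 2))  # about 0.88
```

Dividing by the mean keeps the number comparable between writers who favour long sentences and writers who favour short ones.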

How they look in real text

Burstiness — AI vs human

✗ Low burstiness (AI-typical)

"Social capital plays an important role in educational outcomes. Research has shown that students from higher social classes tend to achieve better results. This is due to the resources and networks available to them. These factors contribute to the reproduction of inequality in education. Bourdieu's framework helps explain this phenomenon clearly."

Five sentences. All roughly the same length. All simple subject-verb-object structure. No rhythm variation at all.

✓ High burstiness (human-typical)

"Social capital matters. But its role in educational outcomes is more complex than simply having the right connections — it operates through what Bourdieu (1986) calls the conversion of cultural capital into academic success, a process that is neither automatic nor inevitable. The working-class student who knows the rules of the academic game can play it. Most don't."

Dramatic sentence length variation. Short opening, long middle sentence, medium, then a two-word punch. That's human rhythm.
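The difference between the two passages shows up as soon as you count words per sentence. The sketch below does that for both samples. The splitting rule and the choice of standard deviation as the summary are assumptions made for illustration, not a description of how any detector scores text.

```python
import re
import statistics

AI_SAMPLE = (
    "Social capital plays an important role in educational outcomes. "
    "Research has shown that students from higher social classes tend to achieve better results. "
    "This is due to the resources and networks available to them. "
    "These factors contribute to the reproduction of inequality in education. "
    "Bourdieu's framework helps explain this phenomenon clearly."
)

HUMAN_SAMPLE = (
    "Social capital matters. "
    "But its role in educational outcomes is more complex than simply having the right "
    "connections — it operates through what Bourdieu (1986) calls the conversion of cultural "
    "capital into academic success, a process that is neither automatic nor inevitable. "
    "The working-class student who knows the rules of the academic game can play it. "
    "Most don't."
)

def word_counts(text):
    """Words per sentence; tokens with no letters or digits (a lone dash) are ignored."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        sum(1 for tok in s.split() if any(ch.isalnum() for ch in tok))
        for s in sentences
    ]

ai_lengths = word_counts(AI_SAMPLE)
human_lengths = word_counts(HUMAN_SAMPLE)

print(ai_lengths)     # [9, 14, 11, 10, 7]
print(human_lengths)  # [3, 38, 14, 2]
print(round(statistics.pstdev(ai_lengths), 1))     # about 2.3
print(round(statistics.pstdev(human_lengths), 1))  # about 14.5
```

The first passage stays between 7 and 14 words per sentence. The second swings from 2 words to 38.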

Why human essays sometimes score low

Discussions of AI detection often miss an important point: some human writers naturally produce low perplexity and burstiness scores. The cause is not AI use but one of the following:

Highly formulaic writing styles — nursing reflections using the Gibbs model, law problem questions using the IRAC structure, scientific lab reports — all have prescribed structures that reduce natural variation
Non-native English speakers — students writing in their second or third language tend to use simpler, more predictable vocabulary and more uniform sentence structures
Over-reliance on academic hedging — if you've been taught to qualify every claim, you might naturally produce a lot of "it could be argued that" and "this suggests that" constructions, which are both predictable and uniform
Subject conventions — some disciplines (law, sciences) have very standardised writing conventions that limit vocabulary and structural variation by design

What this means in practice

A low perplexity or burstiness score is a signal, not a verdict. SafeGrade shows you where your writing sits across all six dimensions so you can make an informed decision about what to address. If you wrote the essay yourself and still have a low score, the Writing Coach can help you identify specific passages to vary before submission.

How to improve your scores

Improve perplexity by using specific, subject-appropriate vocabulary and concrete examples instead of general descriptors
Improve burstiness by deliberately varying sentence length: include some very short sentences alongside longer analytical ones

The single most effective thing you can do for both scores is to write more specifically. Instead of "research shows that education is affected by social factors," write "Coleman's (1966) landmark study found that family background explained more variance in educational achievement than school resources." Specific claims, specific evidence, specific vocabulary — all of these raise perplexity naturally.

For burstiness, read your essay aloud. If every sentence takes the same amount of time to read, that's a problem. Try to end each paragraph with either a very short, punchy sentence or a longer elaborated one — not another medium-length one.
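If you would rather check this mechanically than by ear, a rough sketch like the one below can flag stretches where several sentences in a row have nearly the same word count. The function name `flat_runs` and the thresholds (`tolerance=3` words, `min_run=3` sentences) are arbitrary assumptions chosen for illustration. Tune them to your own writing.

```python
def flat_runs(lengths, tolerance=3, min_run=3):
    """Find (start, end) index ranges where consecutive sentence lengths
    stay within `tolerance` words of each other for at least `min_run` sentences."""
    runs = []
    start = 0
    for i in range(1, len(lengths) + 1):
        at_end = i == len(lengths)
        if not at_end:
            window = lengths[start:i + 1]
            if max(window) - min(window) <= tolerance:
                continue  # sentence i still fits the current run
        # The run ended just before sentence i; keep it if it is long enough.
        if i - start >= min_run:
            runs.append((start, i - 1))
        start = i
    return runs

# Hypothetical word counts for the six sentences of one paragraph.
paragraph_lengths = [9, 10, 11, 10, 3, 25]
print(flat_runs(paragraph_lengths))  # [(0, 3)]
```

Here the first four sentences form one flat stretch, so that is where a very short or a much longer sentence would do the most good.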

SafeGrade's Writing Coach can analyse specific paragraphs and suggest how to increase sentence variety while keeping your argument intact. Ask it: "The sentences in this paragraph are all a similar length. Can you help me vary the rhythm without changing the argument?"

Check all six writing dimensions on your essay.

Perplexity, burstiness, vocabulary diversity, phrase patterns, sentence variation, paragraph structure — all measured free on every scan. No account needed for the first scan.

