When you see scores like "Perplexity: 78" and "Burstiness: 85" in SafeGrade's writing analysis, what do they actually mean? And why does AI writing consistently score lower on both?
These two signals are commonly cited as the core of AI detection systems, including Turnitin's. Understanding them helps you interpret your SafeGrade results, and it also shows you what authentic academic writing looks like from a statistical perspective.
Plain English definition: Perplexity measures how surprising or unpredictable the word choices in a piece of writing are to a language model. A high perplexity score means the text frequently uses words the model did not expect. A low perplexity score means the word choices are highly predictable: each word is statistically "obvious" given what came before it.
Why AI scores low: Large language models work by predicting the most likely next token (word or word-fragment) given the context. Because they're trained to produce fluent, coherent text, they tend to choose statistically common, expected words. This makes AI writing smooth and readable — but also statistically predictable. Detection tools measure this predictability and flag it.
Why humans score high: Human writers make surprising choices — they use unusual vocabulary, unexpected metaphors, specific concrete examples, and personal turns of phrase. A student writing about sociology might use the specific word "habitus" rather than "social disposition." A student writing about law might use "tortfeasor" rather than "person who caused the harm." These choices are unpredictable to a model trained on general text, which raises the perplexity score.
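To make the idea concrete, here is a minimal sketch of a perplexity calculation. It assumes a toy unigram model with add-one smoothing, trained on a single reference sentence. Real detectors use large neural language models, and SafeGrade's actual scoring and scale are not described here, so the function names (`train_unigram`, `perplexity`) and the numbers are purely illustrative.

```python
import math
from collections import Counter


def train_unigram(corpus_tokens):
    """Return a function giving add-one smoothed unigram probabilities.

    Toy stand-in for a real language model (an assumption for illustration).
    """
    counts = Counter(corpus_tokens)
    total = len(corpus_tokens)
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def perplexity(tokens, prob):
    """exp of the average negative log-probability per token."""
    if not tokens:
        raise ValueError("need at least one token")
    avg_neg_log = -sum(math.log(prob(t)) for t in tokens) / len(tokens)
    return math.exp(avg_neg_log)


# Hypothetical "general text" the toy model has seen.
reference = "the person who caused the harm must pay the person who was harmed".split()
prob = train_unigram(reference)

common = "the person who caused the harm".split()
specialised = "the tortfeasor caused the harm".split()

# The specialised term is unseen by the model, so it raises perplexity.
print(perplexity(common, prob) < perplexity(specialised, prob))  # True
```

The word "tortfeasor" never appears in the reference text, so the model assigns it a low probability and the sentence's perplexity goes up. This is the same effect, in miniature, as the one described above.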
Plain English definition: Burstiness measures how much sentence lengths vary throughout a piece of writing. High burstiness means the text has dramatic variation — short punchy sentences followed by long complex ones. Low burstiness means all sentences are roughly the same length — uniform, consistent, regular.
Why AI scores low: AI models produce text with remarkably consistent sentence structure. If you ask ChatGPT to write an essay, it tends to produce sentences of similar length and complexity throughout, because that is the style its training favoured as "good writing." The result is text that flows well but has an almost metronomic regularity that human writers don't have.
Why humans score high: Natural human writing is rhythmically irregular. Writers speed up and slow down. They use one-word sentences. Then they use longer, more elaborated sentences that develop a point through multiple clauses, building toward a conclusion that brings the argument together. That variation — the burst pattern — is what burstiness captures.
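One simple way to put a number on this variation is the coefficient of variation of sentence lengths (standard deviation divided by mean). The sketch below assumes that definition and a naive sentence splitter. SafeGrade's own burstiness formula is not given in this article, so treat `burstiness` here as a hypothetical illustration, not the product's method.

```python
import re
import statistics


def sentence_lengths(text):
    """Word count of each sentence, splitting naively on '.', '!' and '?'.

    Abbreviations such as "e.g." would be split incorrectly; a real tool
    would need a proper sentence tokenizer.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Coefficient of variation of sentence lengths (std / mean)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for a single sentence
    return statistics.pstdev(lengths) / statistics.mean(lengths)


uniform = (
    "The model writes a sentence. The model writes another one. "
    "The model keeps on going."
)
varied = (
    "Writers speed up. Stop. Then they build a much longer sentence "
    "that develops a point through several clauses."
)

print(sentence_lengths(uniform))  # [5, 5, 5]
print(sentence_lengths(varied))   # [3, 1, 14]
print(burstiness(uniform) < burstiness(varied))  # True
```

Three sentences of five words each give a value of zero, while the mix of a one-word sentence and a fourteen-word sentence gives a much higher value.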
How they look in real text
[Figure: Burstiness, AI vs human]
Why human essays sometimes score low
Discussions of AI detection often miss an important point: some human writers naturally produce low perplexity and burstiness scores without using AI at all. Common reasons include:
- Highly formulaic writing styles — nursing reflections using the Gibbs model, law problem questions using the IRAC structure, scientific lab reports — all have prescribed structures that reduce natural variation
- Non-native English speakers — students writing in their second or third language tend to use simpler, more predictable vocabulary and more uniform sentence structures
- Over-reliance on academic hedging — if you've been taught to qualify every claim, you might naturally produce a lot of "it could be argued that" and "this suggests that" constructions, which are both predictable and uniform
- Subject conventions — some disciplines (law, sciences) have very standardised writing conventions that limit vocabulary and structural variation by design
A low perplexity or burstiness score is a signal, not a verdict. SafeGrade shows you where your writing sits across all six dimensions so you can make an informed decision about what to address. If you wrote the essay yourself and still have a low score, the Writing Coach can help you identify specific passages to vary before submission.
How to improve your scores
The single most effective thing you can do for both scores is to write more specifically. Instead of "research shows that education is affected by social factors," write "Coleman's (1966) landmark study found that family background explained more variance in educational achievement than school resources." Specific claims, specific evidence, specific vocabulary — all of these raise perplexity naturally.
For burstiness, read your essay aloud. If every sentence takes the same amount of time to read, that's a problem. Try to end each paragraph with either a very short, punchy sentence or a longer elaborated one — not another medium-length one.
SafeGrade's Writing Coach can analyse specific paragraphs and suggest how to increase sentence variety while keeping your argument intact. Ask it: "This paragraph all has similar sentence length. Can you help me vary the rhythm without changing the argument?"