You wrote your essay. You spent days on it — reading sources, outlining, drafting, revising. You submit it. Then your professor tells you Turnitin flagged it as AI-generated.
If English is not your first language, this is not a rare edge case. It is a structural problem with how AI detectors work, and it is now being challenged in federal courts.
This guide explains why ESL writing gets flagged, what the research shows, what courts have ruled so far, and — most importantly — exactly how to build your authorship defense before you ever hit "submit."
Why AI detectors flag ESL writing
AI detectors do not actually detect AI. They measure two statistical properties of text:
- Perplexity — how predictable each word choice is. Low perplexity means the text follows common patterns.
- Burstiness — how much sentence length and structure vary. Low burstiness means the sentences are uniform.
Large language models tend to produce low-perplexity, low-burstiness text because they generate by favoring high-probability next words, which keeps their output fluent and consistent. When a detector sees text with those properties, it flags it.
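The two signals can be made concrete with a toy calculation. The sketch below is only an illustration, not how any commercial detector works: real detectors estimate perplexity with a large language model, while this one assumes a tiny add-one-smoothed word-frequency model built from a reference text. The function names and sample sentences are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def unigram_perplexity(text: str, reference: str) -> float:
    """Perplexity of `text` under an add-one-smoothed unigram model of `reference`.

    Toy stand-in for a detector's language model (an assumption of this sketch).
    """
    ref_counts = Counter(tokenize(reference))
    vocab_size = len(ref_counts) + 1  # one extra slot for unseen words
    total = sum(ref_counts.values())
    words = tokenize(text)
    if not words:
        raise ValueError("text must contain at least one word")
    log_prob = sum(
        math.log((ref_counts[w] + 1) / (total + vocab_size)) for w in words
    )
    return math.exp(-log_prob / len(words))


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    return pstdev(lengths) / mean(lengths)


# Hypothetical sample texts, for illustration only.
reference = "the cat sat on the mat the dog sat on the rug"
common = "the cat sat on the mat"
unusual = "the zebra pondered quantum mats"
uniform = "I like tea. I like rice. I like fish."
varied = "No. I spent three long days rewriting that essay before the deadline. Done."

print(unigram_perplexity(common, reference), unigram_perplexity(unusual, reference))
print(burstiness(uniform), burstiness(varied))
```

In this toy setup, familiar wording scores a lower perplexity than unusual wording, and three equally long sentences score a burstiness of zero. Careful, textbook-style writing lands at the low end of both measures.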
Here is the problem: non-native English speakers often produce low-perplexity text too. If you are still building vocabulary, you use safer word choices. If you studied grammar from textbooks, you write in standard patterns. If you translate from your native language, the result tends to be structurally uniform. To a detector, your careful, correct English can look the same as machine output.
This is not speculation. A peer-reviewed study at Stanford tested it directly.
The numbers: 61% false positive rate
In 2023, researchers at Stanford (Liang, Zou, et al.) published a study in Patterns (Cell Press) titled "GPT detectors are biased against non-native English writers." They ran 91 human-written ESL essays (from TOEFL preparation materials) and 88 essays written by US 8th graders through seven commercial AI detectors.
The results:
- On average across the seven detectors, 61.22% of the human-written ESL essays were incorrectly flagged as AI-generated
- 97.8% of the ESL essays were flagged by at least one of the seven detectors
- 19.8% were flagged unanimously by all seven detectors
- The same detectors correctly identified the native English essays with near-perfect accuracy
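To make the percentages tangible, they can be converted back into essay counts. This is a minimal sketch that assumes the 97.8% and 19.8% figures are both fractions of the 91 ESL essays described above. The helper name is hypothetical.

```python
TOTAL_ESL_ESSAYS = 91  # number of ESL essays reported for the study


def essays_from_percent(percent: float, total: int = TOTAL_ESL_ESSAYS) -> int:
    """Convert a reported percentage back into a whole number of essays."""
    return round(percent / 100 * total)


flagged_by_any = essays_from_percent(97.8)  # flagged by at least one detector
flagged_by_all = essays_from_percent(19.8)  # flagged by all seven detectors
never_flagged = TOTAL_ESL_ESSAYS - flagged_by_any

print(f"Flagged by at least one detector: {flagged_by_any} of {TOTAL_ESL_ESSAYS}")
print(f"Flagged by all seven detectors:   {flagged_by_all} of {TOTAL_ESL_ESSAYS}")
print(f"Never flagged by any detector:    {never_flagged} of {TOTAL_ESL_ESSAYS}")
```

Under that assumption, only 2 of the 91 human-written ESL essays passed every detector cleanly.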
To test whether linguistic simplicity itself drives the flags, the researchers ran a follow-up experiment. They took the human-written essays by native English speakers and asked ChatGPT to "simplify" the language to sound non-native. The false-positive rate spiked. They took the ESL essays and asked ChatGPT to "enhance" them to sound native. The false-positive rate dropped. The detectors were not finding AI. They were punishing linguistic simplicity.
A 2026 study by Hadra, Cambridge, and Mesbah in the International Journal for Educational Integrity (Springer) found that Turnitin achieved only 61% overall accuracy, calling the trade-off between detection power and false accusations a "structural mathematical limit, not an engineering flaw that can be patched."
What courts are saying
The legal landscape is shifting. Multiple lawsuits in 2025-2026 are testing whether schools can discipline students based on AI detector scores.
The Palo Alto case — pending federal litigation
A family in California filed a federal lawsuit (Doe et al v. Palo Alto Unified School District et al., docket 5:25-cv-04202) after their child's essay was flagged at 76% by Turnitin. The student's grade dropped from a high B/low A to a C. The family submitted a 1,162-page evidentiary packet (revision history, drafts, Google Docs timestamps) showing that the student wrote the work personally over several days. The district declined to restore the grade. The case is now pending in the U.S. District Court for the Northern District of California, with claims filed under Title VI (national origin discrimination), Title IX, and Due Process.
Newby v. Adelphi — the student won
In January 2026, the New York Supreme Court ruled in Matter of Newby v. Adelphi University. Orion Newby, a freshman with Level 2 Autism Spectrum Disorder enrolled in Adelphi's "Bridges Program," was flagged at 100% by Turnitin on a history paper. The university upheld a failing grade. His family spent over $100,000 in legal fees.
The judge ruled the university's decision was "without valid basis and devoid of reason" and ordered the penalty rescinded and the record fully expunged. The New York Supreme Court is a trial-level court, so the ruling's reach as precedent is limited, but it signals that an AI detector score alone is not a valid basis for academic punishment.
Doe v. Yale — ESL and Title VI, pending
An Executive MBA student at Yale was suspended for a year after a final exam was flagged by GPTZero. The student, a non-native English speaker, filed suit in U.S. District Court for the District of Connecticut arguing that the detector's algorithm has implicit bias against ESL writers — the exact mechanism the Stanford study documented — and that using it to discipline a non-native speaker amounts to national-origin discrimination under Title VI. The case is pending.
The counter-example: Hingham
Not every case goes the student's way. In Massachusetts, a federal court denied a student's injunction after the school found AI-hallucinated citations in the student's work — including a non-existent book. The court reasoned that the school's case rested on independent evidence of cheating, not the detector alone.
The pattern across these cases so far: when the accusation rests solely on a detector score, courts are increasingly skeptical. When the detector is corroborated by other evidence, courts defer to educators. For ESL students doing their own work, the implication is straightforward: your strongest defense is preserved evidence of your writing process.
Universities dropping AI detection
While courts adjudicate, institutions are making their own calls.
The University of Waterloo formally discontinued Turnitin's AI detection in September 2025, stating that the tools are "unreliable" and "biased toward students whose first language is not English." Vanderbilt disabled the detector citing false positive risks and ESL bias. MIT published internal teaching guidance titled "AI Detectors Don't Work. Here's What to Do Instead." Curtin University in Australia announced it will disable Turnitin AI detection starting in 2026.
Several other institutions have quietly disabled or restricted AI detection — among them American University, Boston University, UC Berkeley, Colorado State, DePaul, Georgetown, Michigan State, NYU, and the University of Cape Town. Most did this without press releases.
Even Turnitin itself has shifted. In February 2026, CEO Chris Caren announced a pivot "from detection to transparency." The company's own documentation acknowledges an uncertainty band of roughly plus or minus 15 percentage points and states the AI indicator "should not be the sole basis for punitive action."
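A quick calculation shows what an uncertainty band of that size means for a single score. This is only a sketch: it assumes the band can be read as a simple plus or minus 15 points around the reported score, which may not match how Turnitin defines it internally. The function name is hypothetical.

```python
def score_band(score: float, margin: float = 15.0) -> tuple[float, float]:
    """Return the (low, high) range implied by a symmetric margin, clamped to 0-100."""
    return (max(0.0, score - margin), min(100.0, score + margin))


# The 76% score from the Palo Alto case and the 100% score from Newby v. Adelphi.
print(score_band(76))   # (61.0, 91.0)
print(score_band(100))  # (85.0, 100.0)
```

Read this way, a reported 76% is compatible with anything from 61% to 91%. A range that wide is hard to treat as a precise measurement.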
The 8-step authorship defense
The legal trends are encouraging. But they will not help you in the meeting next Tuesday. What matters in practice is the evidence you have before anyone questions your writing.
Before you write
1. Choose a writing tool that keeps history. Google Docs preserves a full version history by default: every edit, paste, deletion, and timestamp. So do Microsoft Word (with AutoSave on OneDrive), Notion, and most modern editors. If your tool does not keep revision history, switch to one that does. This is the foundation of any authorship defense.
2. Keep your native-language drafts. If you outline or draft in your first language before translating into English, keep that original. A paper trail showing "here is my outline in Mandarin, here is the English translation, here are four rounds of revision" is evidence that no detector can contradict.
3. Save research notes separately. Keep a running document (or folder) of your sources, quotes, notes, and reasoning. If someone asks how you arrived at a particular argument, you can show the path from source to draft.
While you write
4. Write in sessions, not in one sitting. Revision history that shows writing spread over multiple days — with breaks, revisions, deleted paragraphs, restructured sections — is the strongest evidence of human authorship. A document that appears fully formed in one session (even if you genuinely wrote it that way) is harder to defend.
5. Do not paste large blocks from external sources. If you draft sections in a separate document and paste them into your final paper, the version history shows a sudden appearance of fully-formed text. Draft directly in the submission document when possible, or at minimum keep both documents with their own version histories.
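If you want to summarize a revision history before a meeting, the timestamps can be grouped into writing sessions. The sketch below assumes you have copied the timestamps out of your editor's version history by hand. The sample data, the 30-minute gap, and the function name are all hypothetical choices for illustration.

```python
from datetime import datetime, timedelta


def group_sessions(
    timestamps: list[datetime],
    max_gap: timedelta = timedelta(minutes=30),
) -> list[tuple[datetime, datetime]]:
    """Group revision timestamps into (start, end) sessions.

    A new session starts whenever the gap since the previous revision
    exceeds `max_gap`.
    """
    sessions: list[tuple[datetime, datetime]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= max_gap:
            sessions[-1] = (sessions[-1][0], ts)  # extend the current session
        else:
            sessions.append((ts, ts))  # start a new session
    return sessions


# Hypothetical timestamps copied from a document's version history.
revision_times = [
    datetime(2026, 3, 3, 19, 2),
    datetime(2026, 3, 3, 19, 20),
    datetime(2026, 3, 3, 19, 41),
    datetime(2026, 3, 5, 8, 15),
    datetime(2026, 3, 5, 8, 40),
    datetime(2026, 3, 6, 21, 5),
    datetime(2026, 3, 6, 21, 25),
    datetime(2026, 3, 6, 21, 50),
]

sessions = group_sessions(revision_times)
writing_days = {start.date() for start, _ in sessions}
for start, end in sessions:
    print(f"{start:%Y-%m-%d %H:%M} to {end:%H:%M}")
print(f"{len(sessions)} sessions across {len(writing_days)} days")
```

A summary like this does not replace the version history itself. It gives a reviewer a one-page overview they can check against the original record.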
After submission — if you are flagged
6. Do not apologize or admit fault. The instinct when a professor asks "did you use AI?" is to over-explain. Lead with evidence instead. Calmly present your revision history, native-language drafts, research notes, and timestamps. Reference the Stanford study if you are an ESL writer — it is peer-reviewed and directly on point.
7. Ask specific procedural questions. Request in writing: which detector was used, what score triggered the flag, what your institution's formal procedure is for handling AI accusations, and what your appeal options are. The strongest cases against schools involve procedural due process violations — institutions that did not give students fair notice, a chance to present counter-evidence, or a clear appeal path.
8. Know your institution's policy. Read your school's academic integrity policy before any dispute arises. If it does not address AI specifically — or does not specify what evidence counts as a defense — that is an argument worth raising. Many institutions adopted detector tools faster than they updated their integrity codes.
What educators and lawmakers are doing right now
The NEA (National Education Association) — representing 3 million educators — published Five Principles for AI in Education stating that "biased AI cheating detection applications have incorrectly flagged students for misconduct" and specifically naming "emergent multilingual learners" who "have been falsely accused." The American Federation of Teachers passed a resolution urging safeguards for individual rights in AI deployment.
At the state level, Idaho enacted SB 1227 — a statewide AI-in-education framework requiring human-centered oversight and prohibiting high-stakes automated decisions. California's AB 1159, which passed its first chamber, would prohibit ed-tech vendors from training models on student submissions. Maryland and Illinois have similar bills pending.
The direction is clear: the era of treating a detector score as a verdict is ending. But the transition is slow, and individual students bear the risk in the meantime.
How Diglot helps ESL students
Diglot is a bilingual writing workspace built for the exact workflow that gets ESL students flagged: thinking in one language, writing in another, and refining until the English is clean.
The Authorship Certificate tracks every edit, paste, and AI-assist on your document and signs the resulting event chain with a cryptographic key. The result is a public verification URL that anyone (your professor, your department, an appeals committee) can open in any browser to see a tamper-evident record of how the document was actually written. No Diglot account is required to verify.
If you are an ESL student who drafts in your native language and publishes in English, the Certificate captures that entire journey: the first draft, the translation, every round of revision. When someone questions whether you wrote your own work, you do not need to argue. You send a link.
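For readers curious about the general idea behind a signed event chain, here is a generic sketch. It is not Diglot's implementation, which this article does not describe. It assumes a simple SHA-256 hash chain and uses an HMAC with a shared secret as a stand-in for a real public-key signature, since Python's standard library has no asymmetric signing. All names and event fields are hypothetical.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first event


def chain_events(events: list[dict]) -> list[dict]:
    """Link events so that each entry's hash covers the previous entry's hash."""
    chained = []
    prev_hash = GENESIS
    for event in events:
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        chained.append({"event": event, "prev": prev_hash, "hash": entry_hash})
        prev_hash = entry_hash
    return chained


def sign_chain(chained: list[dict], key: bytes) -> str:
    """Sign the final hash. HMAC here stands in for a public-key signature."""
    head = chained[-1]["hash"] if chained else GENESIS
    return hmac.new(key, head.encode(), hashlib.sha256).hexdigest()


def verify_chain(chained: list[dict], signature: str, key: bytes) -> bool:
    """Recompute every hash in order, then check the signature over the last one."""
    prev_hash = GENESIS
    for entry in chained:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = expected
    return hmac.compare_digest(sign_chain(chained, key), signature)


# Hypothetical editing events and key, for illustration only.
events = [
    {"type": "edit", "chars": 120, "time": "2026-03-03T19:02"},
    {"type": "paste", "chars": 45, "time": "2026-03-03T19:20"},
    {"type": "edit", "chars": 300, "time": "2026-03-05T08:15"},
]
key = b"demo-secret-key"
log = chain_events(events)
signature = sign_chain(log, key)
print(verify_chain(log, signature, key))
```

Because each hash covers the one before it, changing any earlier event breaks every later hash, so tampering is detectable. A production system would sign with a private key so that anyone holding only the public key can verify.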
Read the companion articles for more context: AI detection lawsuits in 2026, why AI detectors misread non-native English, how to make your English writing sound natural, and what to do if a client flags your writing.
Start a free draft with Diglot
Frequently asked questions
Can I get expelled for a false AI detection?
Serious penalties are possible. They vary by institution, from a warning or a failing grade to suspension, as in the Yale case. However, courts are increasingly skeptical of punishments based solely on detector scores. In Newby v. Adelphi, a judge overturned the penalty entirely. Your best protection is preserved evidence of your writing process.
Are AI detectors accurate for ESL writing?
No. A peer-reviewed Stanford study found that seven commercial AI detectors incorrectly flagged an average of 61.22% of human-written ESL essays as AI-generated. The bias is structural: detectors measure text predictability, and non-native English writing is often more predictable because of constrained vocabulary and standard grammar patterns.
Does using Grammarly or a paraphrasing tool make my essay look like AI?
Grammar and paraphrasing tools can reduce perplexity (word predictability) and burstiness (sentence variation) — the same signals detectors flag. Using them does not mean you cheated, but it can increase your false-positive risk. Keep records of your drafts before and after tool use.
What evidence do I need to prove I wrote my essay?
The strongest evidence includes: revision history with timestamps (Google Docs, Word, Notion), native-language drafts or outlines, research notes and source links, and intermediate draft versions showing the work evolving over multiple sessions.
My university still uses Turnitin. What should I do?
Write in a tool that preserves revision history. Keep your research notes. If you draft in your native language, save those drafts. Build the proof trail before you submit — not after a flag. Read the full legal landscape article for context on how universities are responding.
Sources
- Liang, W., Zou, J., et al. (2023). "GPT detectors are biased against non-native English writers." Patterns, Cell Press. arxiv.org/abs/2304.02819
- Hadra, Cambridge, Mesbah (2026). "AI Detectors Fail Diverse Student Populations." International Journal for Educational Integrity, Springer.
- Doe et al v. Palo Alto Unified School District et al., docket 5:25-cv-04202, N.D. Cal. Hoodline coverage
- Matter of Newby v. Adelphi University (2026). law.justia.com
- University of Waterloo. "Discontinuing AI detection in Turnitin"
- Vanderbilt University. "Guidance on AI Detection"
- NEA. "Five Principles for AI in Education" (2025)
- AFT. "Resolution on Artificial Intelligence"
This article references active litigation. Case statuses may change. Last verified: 2026-05-19.