aiessaydetector.ai

ChatGPT detector with model-version fingerprinting.

Paste the text. We return an AI-likelihood score, a per-sentence heatmap, and, when the signal is strong enough, a best-match to a specific GPT version.

  • GPT-3.5, GPT-4, GPT-4o, GPT-4.5 match-reporting
  • Sentence-level highlighting
  • Free, 5,000 characters, no account

About the ChatGPT detector

ChatGPT is the model family that started the modern AI-detection problem, and its output is still the most common type people need to identify. This detector is tuned specifically for the GPT family (GPT-3.5, GPT-4, GPT-4o, and GPT-4.5), and when the signal is strong enough, it reports which version the text most closely resembles.

What makes ChatGPT text detectable

  • Low perplexity. Each next word is predictable given the previous ones. Human writers more often produce mild surprises.
  • Low burstiness. Sentence lengths cluster tightly around a median. Human prose varies more.
  • Register default. GPT outputs default to polite, formal prose with balanced paragraphs and signposting ("firstly," "moreover," "in conclusion"). The register is recognizable even when the content is right.
  • Closing tics. Model-family-specific closers ("I hope this helps", "Let me know if…") that survive longer than users realize.
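
The signals above can be sketched in a few lines. This is a minimal illustration, not the detector's actual implementation: it assumes you already have per-token natural-log probabilities from some language model, uses the coefficient of variation of sentence length as a stand-in for burstiness, and checks only the two closers quoted above. The function names and the naive sentence splitter are hypothetical.

```python
import math
import re
import statistics


def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-probability), natural log assumed."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def sentence_lengths(text):
    """Word count per sentence, using a naive split on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text):
    """Coefficient of variation of sentence length; lower means more uniform."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0


# Only the two closers quoted in the list above; a real list would be longer.
CLOSERS = ("i hope this helps", "let me know if")


def has_closing_tic(text):
    """True if the text contains one of the known closing phrases."""
    lowered = text.lower()
    return any(closer in lowered for closer in CLOSERS)
```

Lower perplexity and lower burstiness both point toward model output, but neither is decisive alone; a classifier would combine them with other features.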

Version differences we look for

GPT-3.5 output still shows up in the wild: it's cheap and fast, and some students default to it without realizing which model they're using. Its tells are stronger: more repetition, a tighter sentence-length distribution, and heavier reliance on specific transition words. GPT-4 is harder; GPT-4o is harder still. GPT-4.5 is currently the hardest member of the family to detect, and our best-match reporting is least reliable on it. We surface that uncertainty in the result.

What trips this detector up

Three known failure modes: (1) ChatGPT output that has been heavily edited by a human; at some point there is enough human text to dilute the signal. (2) Prompts that specifically ask ChatGPT to mimic a particular voice ("write like Hemingway"); the stylistic tells shift. (3) Very short passages (under 100 words); there is not enough signal to reach confident conclusions. Our result UI reports confidence; treat low-confidence results as inconclusive, not as clearance.
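
One way to apply that advice when reading a result programmatically is sketched below. Only the 100-word floor comes from the text above; the `interpret` function, the 0.7 confidence cutoff, and the 0.5 score threshold are assumptions chosen for illustration, not values the detector publishes.

```python
def interpret(score, confidence, text,
              min_words=100, min_confidence=0.7, threshold=0.5):
    """Map a detector result to a verdict, refusing to clear weak results.

    score          AI-likelihood in [0, 1]
    confidence     detector-reported confidence in [0, 1]
    min_words      passages shorter than this are always inconclusive
    min_confidence hypothetical cutoff below which results are inconclusive
    threshold      hypothetical score cutoff between the two verdicts
    """
    if len(text.split()) < min_words:
        return "inconclusive"  # too short to carry enough signal
    if confidence < min_confidence:
        return "inconclusive"  # low confidence is not clearance
    return "likely AI-generated" if score >= threshold else "likely human-written"
```

Note that a low score with low confidence still returns "inconclusive" rather than "likely human-written", which mirrors the guidance that low confidence should never be read as clearance.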

Privacy

Pasted text is stored for 30 days and then deleted. We do not train on user submissions. Full policy at /privacy.

Version best-match

When the signal is strong, we report the closest GPT version. This is useful for understanding what tool was used, not for adjudication.

Confidence reporting

Every result includes a confidence band. Low-confidence on short passages is surfaced, not hidden.

Tuned for GPT register

The classifier is weighted toward GPT-family patterns rather than the generic AI-detection baseline.

GPT-family performance

What we know after 1.4M ChatGPT-suspected scans.

  • AUC vs GPT-4o: 0.96. Highest AUC of any single-model class in our test set.
  • Versions covered: GPT-3.5 → 4.5, plus turbo and o-series, retrained within 14 days of each release.
  • Median latency: 1.2s. Includes per-sentence heatmap rendering.
  • ESL false-positive rate: <3%. Audited on 12,000 non-native essays.
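
For readers unfamiliar with these metrics, here is a small sketch of how AUC and a false-positive rate are typically computed from detector scores. It uses the pairwise (Mann-Whitney) definition of AUC; the function names, sample scores, and the 0.5 threshold are illustrative assumptions and say nothing about how the figures above were actually produced.

```python
def auc(ai_scores, human_scores):
    """Probability that a random AI sample outscores a random human sample.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    if not ai_scores or not human_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in ai_scores:
        for h in human_scores:
            if a > h:
                wins += 1.0
            elif a == h:
                wins += 0.5
    return wins / (len(ai_scores) * len(human_scores))


def false_positive_rate(human_scores, threshold=0.5):
    """Share of human-written samples scored at or above the threshold."""
    if not human_scores:
        raise ValueError("need at least one human-written sample")
    flagged = sum(1 for s in human_scores if s >= threshold)
    return flagged / len(human_scores)
```

AUC is threshold-free, while a false-positive rate depends on where the flagging threshold is set, so the two numbers answer different questions about the same scores.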

Frequently asked questions

Does this detect GPT-4o specifically?
Yes. GPT-4o has distinctive patterns the detector is tuned for. The confidence on version best-match is highest for GPT-3.5 and GPT-4, and degrades for GPT-4o and GPT-4.5 as those models produce more human-like output.
Can it tell ChatGPT from Claude?
The model-family classifier is a separate output from the AI-likelihood score. Cross-family classification is reliable on unedited output and degrades on heavily-edited or short text. Our /ai-detector/claude page covers the Claude side specifically.
Will this detect text from the ChatGPT app vs. the API?
The detector doesn't distinguish deployment surface; it sees only the text. Both produce output from the same model families, so the score is the same.
Is this free?
Yes, anonymous use covers 5,000 characters at a time and 5 checks per day. A free account raises daily limits and unlocks sentence-level highlighting.

Ready to check a passage?

Free up to 5,000 characters. No account required for a single check.
