Glossary
Feature.
A measurable property of text that feeds into the classifier. Perplexity and burstiness are classical features.
Feature
In machine learning, a feature is a number (or short vector) derived from raw input. For text, classical features include perplexity, burstiness, n-gram frequencies, punctuation density, and vocabulary richness. Modern detectors combine classical features with learned features (embeddings) for better cross-model generalization.
Classical features still in use
Even in 2026, classical features remain part of production stacks because they're cheap to compute and interpretable. The most common: perplexity (predictability of next-word given a reference LM), burstiness (variance in sentence length), vocabulary richness (type-to-token ratio), punctuation density, and n-gram fingerprints (specific phrase distributions associated with model families). A modern detector typically combines five to ten classical features with one or two learned embedding features.
Why feature engineering didn't disappear
End-to-end neural classifiers were supposed to replace hand-engineered features. They mostly did, except for two cases: short passages (under 200 words, where neural classifiers are unreliable and classical features still carry signal) and audit explainability (where you need to point at a specific feature to justify a decision). Hybrid stacks remain the production norm because they cover those cases gracefully.
How Features Interact with Model Architecture
Features in AI detection systems operate within a hierarchical relationship to the underlying language model. Raw text inputs are first tokenized, then passed through embedding layers that convert discrete tokens into continuous vector representations. These embeddings feed into attention mechanisms that weight contextual relationships, and the outputs of these mechanisms become the features that classification layers evaluate. The perplexity score, for instance, is not a direct feature but rather a composite metric derived from token-level probability distributions across the entire sequence.
This architecture means features exist at multiple levels of abstraction. Surface-level features include token frequency, punctuation density, and sentence length variance. Mid-level features capture syntactic patterns such as dependency parse structures and part-of-speech transitions. High-level features encode semantic properties like coherence scores and topic consistency. Detection models typically combine features from all three levels, which explains why simple prompt injections that only modify surface features often fail to evade robust classifiers. The model's ability to cross-reference features across abstraction levels creates redundancy that improves detection reliability.
Edge Cases and Known Limitations
Feature-based detection encounters significant challenges when analyzing short texts, typically those under 100 words. Many statistical features require sufficient sample size to achieve meaningful signal strength. Burstiness calculations, for example, need multiple sentences to establish variance patterns, and lexical diversity metrics become unreliable when the denominator is small. Academic abstracts, social media posts, and short-answer exam responses therefore produce higher false positive rates because legitimate human writing in these formats may not exhibit the feature distributions the model was trained to recognize.
Domain-specific and multilingual texts present additional boundary conditions. Technical writing in fields like mathematics or computer science often displays low perplexity and high uniformity because precise terminology admits fewer synonyms and sentence structures follow rigid conventional patterns. Similarly, non-native English speakers frequently produce writing with feature profiles that diverge from the training distribution, which was predominantly based on fluent native speaker corpora. Detection systems may misclassify these texts unless the feature extraction pipeline includes domain adaptation layers or the training set explicitly incorporated diverse linguistic backgrounds and specialized vocabularies.