Key Performance Indicators
Every platform in the Global 100 is scored on these twelve metrics. Each KPI has a published definition, measurement method, and score range. Weights vary by category; see Criteria & Weights for the full matrix.
The percentage of correctly classified content samples across a standardized test corpus of 10,000 documents comprising human-written, AI-generated, and mixed-origin content. Measured using balanced accuracy to account for class distribution.
Blind testing against Global 100 reference corpus v3.1
0 to 100
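As an illustration of why balanced accuracy is used here, the sketch below computes it as the mean of per-class recall, so a small class (such as mixed-origin content) counts as much as a large one. The label names and the toy sample are assumptions for illustration only; they are not drawn from the reference corpus.

```python
from collections import defaultdict


def balanced_accuracy(true_labels, predicted_labels):
    """Return balanced accuracy (0 to 100): the mean of per-class recall."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    totals = defaultdict(int)   # samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for truth, pred in zip(true_labels, predicted_labels):
        totals[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / totals[c] for c in totals]
    return 100.0 * sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy sample: 6 human, 3 AI, 1 mixed.
truth = ["human"] * 6 + ["ai"] * 3 + ["mixed"] * 1
preds = ["human"] * 6 + ["ai", "ai", "human"] + ["ai"]

plain = 100.0 * sum(t == p for t, p in zip(truth, preds)) / len(truth)
score = balanced_accuracy(truth, preds)
print(f"plain accuracy: {plain:.1f}, balanced accuracy: {score:.1f}")
```

On this toy sample plain accuracy is 80.0, while balanced accuracy is about 55.6, because the single mixed-origin sample was missed and its class recall of zero carries full weight.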
The percentage of human-written content incorrectly flagged as AI-generated. A critical metric for real-world deployment where false accusations carry reputational and legal risk.
Inverse measurement from accuracy testing on human-only subset
0 to 100 (lower is better)
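A minimal sketch of the false positive calculation on a human-only subset follows. It assumes each prediction is a plain label string and that only the label "ai" counts as a false accusation; how mixed-origin verdicts are treated is not specified above, so that choice is an assumption.

```python
def false_positive_rate(human_subset_predictions, flagged_label="ai"):
    """Percentage of known human-written samples that were flagged as AI-generated."""
    if not human_subset_predictions:
        raise ValueError("the human-only subset must not be empty")
    flagged = sum(1 for p in human_subset_predictions if p == flagged_label)
    return 100.0 * flagged / len(human_subset_predictions)


# Hypothetical run: 200 human-written documents, 7 flagged as AI-generated.
predictions = ["human"] * 193 + ["ai"] * 7
fpr = false_positive_rate(predictions)
print(f"false positive rate: {fpr:.1f}%")
```

With 7 of 200 human documents flagged, the rate is 3.5%.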
The breadth of AI generation models the tool can reliably detect. Scored based on successful detection across GPT-4, Claude 3, Gemini, Llama 3, Mistral, and emerging models.
Per-model accuracy testing across 12 major generation models
0 to 100
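Per-model testing can be pictured as grouping detection outcomes by the model that generated each sample. The sketch below does only that grouping; the record layout is hypothetical, and how the per-model rates are combined into the final 0 to 100 score is not described here, so no aggregation is shown.

```python
from collections import defaultdict


def per_model_detection_rate(results):
    """results: iterable of (model_name, was_detected) pairs for AI-generated samples.

    Returns a dict mapping each model name to its detection rate (0 to 100).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for model, was_detected in results:
        totals[model] += 1
        if was_detected:
            hits[model] += 1
    return {model: 100.0 * hits[model] / totals[model] for model in totals}


# Hypothetical outcomes for two of the generation models.
results = (
    [("GPT-4", True)] * 9 + [("GPT-4", False)]
    + [("Llama 3", True)] * 3 + [("Llama 3", False)]
)
rates = per_model_detection_rate(results)
print(rates)
```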
How frequently detection models are retrained to address new AI generation methods. Measured as average days between model updates over the trailing 12 months.
Changelog analysis and version history tracking
0 to 100
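The "average days between model updates over the trailing 12 months" figure could be derived from changelog dates roughly as follows. The release dates are invented for illustration, the 12-month window is approximated as 365 days, and the mapping from this average to the 0 to 100 score is not specified above, so it is left out.

```python
from datetime import date, timedelta


def average_days_between_updates(release_dates, as_of):
    """Mean gap in days between consecutive releases within the trailing 365 days.

    Returns None when fewer than two releases fall inside the window.
    """
    window_start = as_of - timedelta(days=365)
    recent = sorted(d for d in release_dates if window_start <= d <= as_of)
    if len(recent) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(recent, recent[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical changelog; the 2023 entry falls outside the window and is ignored.
releases = [
    date(2023, 6, 1),
    date(2024, 1, 15),
    date(2024, 3, 15),
    date(2024, 6, 13),
    date(2024, 9, 11),
]
avg_gap = average_days_between_updates(releases, as_of=date(2024, 12, 31))
print(f"average days between updates: {avg_gap}")
```

Here the gaps inside the window are 60, 90, and 90 days, giving an average of 80 days.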
Number of languages supported with verified detection accuracy above 85%. Evaluates both the breadth of language coverage and per-language accuracy.
Multi-language test corpus evaluation across 30 languages
0 to 100
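Counting qualifying languages reduces to a threshold filter over per-language accuracy. In the sketch below the accuracy figures are made up, and "above 85%" is read as strictly greater than 85, which is an assumption about how the boundary case is handled.

```python
def languages_above_threshold(per_language_accuracy, threshold=85.0):
    """Return the sorted language codes whose accuracy is strictly above the threshold."""
    return sorted(
        lang for lang, accuracy in per_language_accuracy.items()
        if accuracy > threshold
    )


# Hypothetical per-language accuracy results (percent).
accuracy_by_language = {"en": 96.2, "es": 91.0, "de": 85.0, "ja": 78.4}
qualifying = languages_above_threshold(accuracy_by_language)
print(f"{len(qualifying)} qualifying languages: {qualifying}")
```

With these figures only "en" and "es" qualify; "de" sits exactly on the threshold and is excluded under the strict reading.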
Whether the platform publicly discloses its detection methodology, publishes research papers, or provides open-source components. Scored across five transparency dimensions.
Structured audit of public documentation and publications
0 to 100
Whether the platform has undergone independent third-party accuracy auditing. Scored based on recency, scope, and auditor reputation.
Verification of audit reports and auditor credentials
0 to 100
Published peer-reviewed papers, open datasets, benchmark contributions, and participation in industry standards bodies (C2PA, NIST FATE).
Publication and contribution audit
0 to 100
Processing latency for single-document analysis and maximum batch throughput. Throughput is measured in documents per second under standard load conditions.
Automated benchmarking under controlled network conditions
0 to 100
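A rough sketch of how latency and throughput might be timed is shown below. The `analyze` callable is a hypothetical stand-in for a platform's detection call, and the loop is sequential, so it is a simplification: it does not model network conditions, concurrency, or a true maximum batch throughput.

```python
import time


def benchmark(analyze, documents):
    """Time analyze() over documents; return mean latency (s) and throughput (docs/s)."""
    if not documents:
        raise ValueError("at least one document is required")
    latencies = []
    start = time.perf_counter()
    for doc in documents:
        t0 = time.perf_counter()
        analyze(doc)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "throughput_docs_per_s": len(documents) / elapsed,
    }


def fake_analyze(text):
    """Hypothetical stand-in for a detector: does a small amount of work per document."""
    return len(text.split())


stats = benchmark(fake_analyze, ["sample document text " * 50] * 100)
print(stats)
```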
Quality and completeness of developer API: documentation standards, rate limits, authentication methods, SDK availability, and webhook support.
Developer experience audit using standardized rubric
0 to 100
Adherence to GDPR, CCPA, and relevant data protection regulations. Evaluates data retention policies, encryption standards, and user consent mechanisms.
Policy review and technical verification
0 to 100
Availability of free tier, fairness of pricing structure, and provision of educational or nonprofit discounts. Evaluates whether the tool is accessible beyond enterprise buyers.
Pricing analysis and tier comparison
0 to 100
- Mitchell et al. (2023), DetectGPT: foundational research on zero-shot machine-generated text detection using probability curvature. Informs our perplexity and burstiness measurement methods.
- Stanford HAI (2023), AI detector bias against non-native English writers: documented false positive bias in commercial AI detectors. Informs our False Positive Rate KPI methodology and ESL test corpus.
- Google DeepMind, SynthID: technical documentation of watermark-based detection at generation time. Informs our Watermark Robustness scoring.
- C2PA Specification 2.1: the open content authentication standard. Defines what we measure in the Transparency Score and Content Authentication categories.
- NIST AI Risk Management Framework: the framework that informs our Privacy Compliance and Independent Audit Status KPIs.