12 Performance Indicators

Key Performance Indicators

Every platform in the Global 100 is scored on these twelve metrics. Each KPI has a published definition, measurement method, and score range. Weights vary by category; see Criteria & Weights for the full matrix.

Accuracy & Precision

Does the platform produce correct, calibrated outputs?

32%

Definition

The percentage of correctly classified content samples across a standardized test corpus of 10,000 documents comprising human-written, AI-generated, and mixed-origin content. Measured using balanced accuracy to account for class distribution.

Measurement Method

Blind testing against Global 100 reference corpus v3.1

Score Range

0 to 100
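
As a rough illustration of why balanced accuracy is used here, the sketch below computes it as the mean of per-class recall. The class labels (`human`, `ai`, `mixed`), the toy sample, and the scaling to a 0 to 100 figure are assumptions for illustration, not the reference corpus or the published scoring code.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# Hypothetical, deliberately imbalanced sample: 6 human, 2 ai, 2 mixed.
y_true = ["human"] * 6 + ["ai"] * 2 + ["mixed"] * 2
y_pred = ["human"] * 6 + ["ai", "human"] + ["mixed", "ai"]

plain = 100 * sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = 100 * balanced_accuracy(y_true, y_pred)
print(f"plain accuracy: {plain:.1f}, balanced accuracy: {score:.1f}")
```

On this toy sample plain accuracy is 80.0 while balanced accuracy is about 66.7, because the two smaller classes are each only half right and no longer hide behind the large human-written class.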

Definition

The percentage of human-written content incorrectly flagged as AI-generated. A critical metric for real-world deployment where false accusations carry reputational and legal risk.

Measurement Method

Complement of accuracy on the human-only subset of the accuracy test corpus

Score Range

0 to 100 (lower is better)
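
A minimal sketch of the calculation, assuming predictions are available as one label per human-written sample and that only the `ai` label counts as a flag. How a `mixed` verdict on human text is treated is not stated on this page, so that choice is an assumption.

```python
def false_positive_rate(human_predictions, flagged_label="ai"):
    """Percentage of human-written samples that were flagged as AI-generated."""
    if not human_predictions:
        raise ValueError("need at least one human-written sample")
    flagged = sum(1 for label in human_predictions if label == flagged_label)
    return 100 * flagged / len(human_predictions)

# Hypothetical human-only subset: 50 samples, 3 wrongly flagged.
predictions = ["human"] * 47 + ["ai"] * 3
fpr = false_positive_rate(predictions)
print(f"false positive rate: {fpr:.1f}")
```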

Coverage & Robustness

How broad is the platform’s detection surface?

22%

Definition

The breadth of AI generation models the tool can reliably detect. Scored based on successful detection across GPT-4, Claude 3, Gemini, Llama 3, Mistral, and emerging models.

Measurement Method

Per-model accuracy testing across 12 major generation models

Score Range

0 to 100

Definition

How frequently detection models are retrained to address new AI generation methods. Measured as average days between model updates over the trailing 12 months.

Measurement Method

Changelog analysis and version history tracking

Score Range

0 to 100
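
The interval itself can be sketched as follows, assuming release dates have already been extracted from a changelog. The function name, the 365-day window, and the sample dates are hypothetical, and the mapping from days to a 0 to 100 score is not described here, so it is left out.

```python
from datetime import date, timedelta

def mean_days_between_updates(release_dates, as_of, window_days=365):
    """Average gap in days between consecutive releases inside the trailing window."""
    start = as_of - timedelta(days=window_days)
    recent = sorted(d for d in release_dates if start <= d <= as_of)
    if len(recent) < 2:
        return None  # fewer than two releases: no interval to measure
    gaps = [(later - earlier).days for earlier, later in zip(recent, recent[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical changelog dates for one platform.
releases = [date(2024, 1, 10), date(2024, 3, 10), date(2024, 6, 8), date(2024, 9, 6)]
avg_gap = mean_days_between_updates(releases, as_of=date(2024, 12, 31))
print(avg_gap)
```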

Definition

Number of languages supported with verified detection accuracy above 85%. Evaluates both the breadth of language coverage and per-language accuracy.

Measurement Method

Multi-language test corpus evaluation across 30 languages

Score Range

0 to 100
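
A small sketch of the threshold check, assuming per-language accuracy is already available as a percentage. The language codes and figures are invented for illustration; only the 85% cut-off comes from the definition above.

```python
def languages_above_threshold(per_language_accuracy, threshold=85.0):
    """Return the languages whose verified accuracy is strictly above the threshold."""
    return sorted(
        lang for lang, accuracy in per_language_accuracy.items() if accuracy > threshold
    )

# Hypothetical per-language results from a multi-language test corpus.
results = {"en": 96.2, "es": 91.0, "de": 88.4, "ja": 85.0, "sw": 71.3}
supported = languages_above_threshold(results)
print(len(supported), supported)
```

Here `ja` sits exactly at 85.0 and is excluded, since the definition requires accuracy above 85%.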

Transparency & Governance

Is the methodology open, audited, and accountable?


21%

Definition

Whether the platform publicly discloses its detection methodology, publishes research papers, or provides open-source components. Scored across five transparency dimensions.

Measurement Method

Structured audit of public documentation and publications

Score Range

0 to 100

Definition

Whether the platform has undergone independent third-party accuracy auditing. Scored based on recency, scope, and auditor reputation.

Measurement Method

Verification of audit reports and auditor credentials

Score Range

0 to 100

Definition

Published peer-reviewed papers, open datasets, benchmark contributions, and participation in industry standards bodies (C2PA, NIST FATE).

Measurement Method

Publication and contribution audit

Score Range

0 to 100

Enterprise & Privacy

Is it production-ready, and does it respect user data?

25%

Definition

Processing latency for single-document analysis and maximum batch throughput. Throughput is measured in documents per second under standard load conditions.

Measurement Method

Automated benchmarking under controlled network conditions

Score Range

0 to 100
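
The two quantities can be separated in a simple timing harness. This is only a sketch: `analyze` is a hypothetical stand-in for a platform call, documents are processed sequentially rather than in batches, and nothing here models the controlled network conditions mentioned above.

```python
import time

def benchmark(analyze, documents):
    """Time each call for latency and the whole run for throughput."""
    latencies = []
    run_start = time.perf_counter()
    for doc in documents:
        call_start = time.perf_counter()
        analyze(doc)
        latencies.append(time.perf_counter() - call_start)
    elapsed = time.perf_counter() - run_start
    return {
        "mean_latency_s": sum(latencies) / len(latencies),
        "docs_per_second": len(documents) / elapsed,
    }

def analyze(doc):
    # Hypothetical placeholder workload, not a real detector.
    return sum(len(word) for word in doc.split())

documents = ["sample text for the detector " * 200] * 500
stats = benchmark(analyze, documents)
print(stats)
```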

Definition

Quality and completeness of developer API: documentation standards, rate limits, authentication methods, SDK availability, and webhook support.

Measurement Method

Developer experience audit using standardized rubric

Score Range

0 to 100

Definition

Adherence to GDPR, CCPA, and relevant data protection regulations. Evaluates data retention policies, encryption standards, and user consent mechanisms.

Measurement Method

Policy review and technical verification

Score Range

0 to 100

Definition

Availability of free tier, fairness of pricing structure, and provision of educational or nonprofit discounts. Evaluates whether the tool is accessible beyond enterprise buyers.

Measurement Method

Pricing analysis and tier comparison

Score Range

0 to 100
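
To give a sense of how the four category percentages shown on this page could combine, here is a sketch of a weighted composite. The weights (32, 22, 21, 25) are the figures shown above. The per-category scores are invented, and treating the overall score as a simple weighted sum of category scores is an assumption. The page notes that weights vary by category, so the Criteria & Weights matrix remains the authority.

```python
# Category weights in percent, as shown on this page.
CATEGORY_WEIGHTS = {
    "Accuracy & Precision": 32,
    "Coverage & Robustness": 22,
    "Transparency & Governance": 21,
    "Enterprise & Privacy": 25,
}

def composite_score(category_scores, weights=CATEGORY_WEIGHTS):
    """Weighted sum of 0-100 category scores; weights must total 100 percent."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    weighted = sum(weights[name] * category_scores[name] for name in weights)
    return weighted / 100

# Hypothetical category scores for one platform.
example = {
    "Accuracy & Precision": 90,
    "Coverage & Robustness": 80,
    "Transparency & Governance": 70,
    "Enterprise & Privacy": 60,
}
overall = composite_score(example)
print(overall)
```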

Research & Standards Cited

Mitchell et al. (2023), DetectGPT: foundational research on zero-shot machine-generated text detection using probability curvature. Informs our perplexity and burstiness measurement methods.
Stanford HAI (2023), AI detector bias against non-native English writers: documented false positive bias in commercial AI detectors. Informs our False Positive Rate KPI methodology and ESL test corpus.
Google DeepMind, SynthID: technical documentation of watermark-based detection at generation time. Informs our Watermark Robustness scoring.
C2PA Specification 2.1: the open content authentication standard. Defines what we measure in the Transparency Score and Content Authentication categories.
NIST AI Risk Management Framework: the framework that informs our Privacy Compliance and Independent Audit Status KPIs.
