Briefing · 22 March 2026

What "rights-aware ingestion" actually means

The six audit areas, the 95% Rights Payload threshold, and the platform engineering changes required to meet certification. A three-page briefing.

Executive Briefing 03 · 3 pages · Updated March 2026

Definition

"Rights-aware ingestion" means that every piece of content entering your AI training pipeline has its rights status documented, verified, and respected before ingestion. It is not a policy statement — it is an engineering requirement with six measurable audit areas.

The six audit areas

  • Source identification: Every training item has a documented source with rights-holder attribution.
  • Licence verification: The legal basis for ingestion is documented — owned, licensed, public domain, or statutory exception.
  • TDM opt-out checking: Machine-readable text and data mining (TDM) opt-out signals are checked before ingestion. Content carrying opt-outs is excluded unless separately licensed.
  • CDR registration: Ingested content is cross-referenced with the CIP Rights Registry for active Core Data Records.
  • Consent expiry monitoring: Time-limited consents are tracked. Content is removed from active training when consent expires.
  • Audit trail: Every ingestion decision is logged with a timestamp, the decision rationale, and the responsible person.
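
As a rough illustration of how the six areas could combine into a single pre-ingestion gate, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CandidateItem` and `IngestionDecision` records, the `decide_ingestion` function, and the `registry_lookup` callable (a stand-in for whatever interface the CIP Rights Registry actually exposes) are hypothetical names, not part of any CIP specification.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional

# Legal bases named in the licence-verification audit area.
LEGAL_BASES = {"owned", "licensed", "public_domain", "statutory_exception"}


@dataclass
class CandidateItem:
    """Hypothetical record for one piece of content awaiting ingestion."""
    item_id: str
    source: Optional[str] = None
    rights_holder: Optional[str] = None
    legal_basis: Optional[str] = None
    tdm_opt_out: bool = False            # a machine-readable opt-out was found
    separately_licensed: bool = False    # a licence overrides the opt-out
    consent_expires: Optional[datetime] = None  # None means no time limit


@dataclass
class IngestionDecision:
    """One audit-trail entry: what was decided, why, when, and by whom."""
    item_id: str
    ingest: bool
    rationale: List[str]
    has_active_cdr: bool
    decided_by: str
    decided_at: datetime


def decide_ingestion(
    item: CandidateItem,
    registry_lookup: Callable[[str], bool],  # assumed: True if an active CDR exists
    decided_by: str,
    audit_log: List[IngestionDecision],
    now: Optional[datetime] = None,
) -> IngestionDecision:
    now = now or datetime.now(timezone.utc)
    problems: List[str] = []

    # 1. Source identification
    if not (item.source and item.rights_holder):
        problems.append("missing source or rights-holder attribution")
    # 2. Licence verification
    if item.legal_basis not in LEGAL_BASES:
        problems.append("no documented legal basis")
    # 3. TDM opt-out checking
    if item.tdm_opt_out and not item.separately_licensed:
        problems.append("TDM opt-out without separate licence")
    # 5. Consent expiry (checked here at decision time only)
    if item.consent_expires is not None and item.consent_expires <= now:
        problems.append("consent expired")

    decision = IngestionDecision(
        item_id=item.item_id,
        ingest=not problems,
        rationale=problems or ["all checks passed"],
        has_active_cdr=registry_lookup(item.item_id),  # 4. CDR cross-reference
        decided_by=decided_by,
        decided_at=now,
    )
    audit_log.append(decision)  # 6. Audit trail
    return decision
```

The sketch only records the result of the CDR lookup rather than acting on it, since the handling of a match depends on the registry's own rules. Consent expiry is checked once, at decision time; removing content when consent later expires would need the same check re-run on a schedule against already-ingested items.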

The 95% threshold

CIP Platform Certification Level 2 requires that at least 95% of your training corpus has documented rights coverage, meaning a CDR record, a valid licence, or a verified public-domain determination. The remainder (at most 5%) must have an active remediation plan with named deadlines.

This is not aspirational. Platforms that cannot demonstrate 95% coverage cannot certify at Level 2. The threshold reflects the operational reality that legacy content may take time to audit, but the vast majority of a responsible operator's corpus should be documented.
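
To make the threshold concrete, the following Python sketch computes coverage for a corpus and flags uncovered items that lack a remediation deadline. It assumes coverage is counted per item; the briefing does not say whether the unit is items, documents, or tokens, so treat that as a placeholder. The `CorpusItem` fields and the `level_2_report` function are hypothetical names, not a CIP-defined schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

LEVEL_2_PERCENT = 95  # threshold stated in the briefing


@dataclass
class CorpusItem:
    """Hypothetical per-item rights summary (assumed unit of counting)."""
    item_id: str
    has_cdr: bool = False
    has_valid_licence: bool = False
    verified_public_domain: bool = False
    remediation_deadline: Optional[date] = None  # named deadline, if any

    @property
    def covered(self) -> bool:
        # Any one of the three forms of documented coverage is enough.
        return self.has_cdr or self.has_valid_licence or self.verified_public_domain


def level_2_report(items: List[CorpusItem]) -> Dict[str, object]:
    total = len(items)
    covered = sum(1 for item in items if item.covered)
    # Uncovered items with no named deadline have no remediation plan.
    unplanned = [
        item.item_id
        for item in items
        if not item.covered and item.remediation_deadline is None
    ]
    # Integer comparison avoids floating-point edge cases at exactly 95%.
    meets_threshold = total > 0 and covered * 100 >= LEVEL_2_PERCENT * total
    return {
        "total": total,
        "covered": covered,
        "meets_threshold": meets_threshold,
        "unplanned": unplanned,
        "eligible": meets_threshold and not unplanned,
    }
```

Under these assumptions, a corpus of 20 items with 19 covered and one uncovered item carrying a deadline reports `eligible` as true; remove the deadline, or let a second item go uncovered, and it reports false. The sketch treats any deadline as an active plan and does not check whether the deadline has already passed.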