The legal status of TDM opt-outs in machine-readable form.

The question

Both the UK Data (Use and Access) Act 2025 and EU AI Act Article 53 require AI training operations to honour rights holders' opt-outs from text and data mining. Both statutes require the opt-out to be expressed in a "machine-readable" form. Neither statute specifies what that form is. The question this paper addresses: does a TDM opt-out declared through the four-field combination in a published cip.md file meet the machine-readability requirement, and what evidentiary weight does it carry against an AI operator who ingested the content despite the declaration?

CIP's position

A TDM opt-out declared through CIP-TDM-Opt-Out: true plus CIP-TDM-Opt-Out-Scope plus CIP-TDM-Jurisdiction plus CIP-Training-Ingestion: Prohibited, served from a stable /.well-known/cip.md path on the rights holder's domain, meets the machine-readability requirement under both the UK DUA Act 2025 and EU AI Act Article 53. The four-field combination produces a parseable, deterministic, jurisdictionally scoped declaration that any conformant AI ingester can resolve. The serving path is stable, the syntax is documented, and every HTTP fetch of the declaration yields a timestamp for what was published at that moment.
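As a minimal sketch of what "parseable and deterministic" could mean in practice, the Python below reads the four fields from a cip.md body and resolves them to a single yes/no decision. It assumes one "Name: value" pair per line, which is inferred from the field notation used above rather than taken from the published parser specification. The sample values for CIP-TDM-Opt-Out-Scope and CIP-TDM-Jurisdiction, and both function names, are hypothetical.

```python
# Illustrative only: assumes cip.md carries one "Name: value" pair per line.
# This is not the framework's reference parser.

REQUIRED_FIELDS = (
    "CIP-TDM-Opt-Out",
    "CIP-TDM-Opt-Out-Scope",
    "CIP-TDM-Jurisdiction",
    "CIP-Training-Ingestion",
)


def parse_cip_fields(text):
    """Collect 'Name: value' pairs whose name starts with 'CIP-'."""
    fields = {}
    for raw_line in text.splitlines():
        name, sep, value = raw_line.strip().partition(":")
        if sep and name.strip().startswith("CIP-"):
            fields[name.strip()] = value.strip()
    return fields


def tdm_opt_out_declared(fields):
    """True only when all four fields are present and the two decisive ones opt out."""
    if any(name not in fields for name in REQUIRED_FIELDS):
        return False
    return (
        fields["CIP-TDM-Opt-Out"].lower() == "true"
        and fields["CIP-Training-Ingestion"].lower() == "prohibited"
    )


# Hypothetical sample; the Scope and Jurisdiction values are placeholders.
SAMPLE = """\
CIP-TDM-Opt-Out: true
CIP-TDM-Opt-Out-Scope: all-content
CIP-TDM-Jurisdiction: UK, EU
CIP-Training-Ingestion: Prohibited
"""

if __name__ == "__main__":
    print(tdm_opt_out_declared(parse_cip_fields(SAMPLE)))
```

The sketch treats a missing field as "no opt-out declared" rather than guessing, so the same input always resolves to the same answer; whether the framework's own parser makes that choice is not stated here.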

CIP's position is contested on two distinct lines, which this paper addresses directly.

Counter-argument 1 — the standardisation objection

The argument: machine-readability requires a recognised standard (analogous to robots.txt or schema.org), and a CIP-defined format is not a recognised standard until adopted formally — by IETF, W3C, ISO, or equivalent. Until then, an AI operator is not obliged to parse it.

CIP's response. The argument confuses formal standardisation with operational machine-readability. Robots.txt was honoured as machine-readable for more than two decades before it became a formal standard (the convention dates from 1994; RFC 9309 was published in 2022). Schema.org has been honoured as machine-readable from inception, despite never being adopted as a formal standard by a standards body. The legal test the statutes apply is whether the opt-out is machine-readable (that is, parseable by an automated system), not whether it sits in a formal-standards-track document. The cip.md format is parseable by any automated system that implements the documented syntax (the framework publishes the parser specification openly; reference implementations exist). The operational machine-readability requirement is met regardless of the standardisation status.

A separate point: the framework does not preclude formal standardisation. CIP has expressed willingness to pursue IETF standards-track submission for cip.md when the format reaches v4.0 stability. In the interim, the test the statutes apply remains operational machine-readability, which framework maturity and operational adoption help to evidence, not standards-track status.

Counter-argument 2 — the discoverability objection

The argument: an opt-out that an AI operator could only discover by parsing a CIP-format file at a non-canonical path is not "machine-readable" because the operator must first know the format exists to look for it.

CIP's response. The framework recommends three concurrent discoverability mechanisms, each independently sufficient. First, cip.md is served from /.well-known/cip.md by convention, under the /.well-known/ path prefix that RFC 8615 defines for site-wide metadata. An AI operator looking for any policy declaration at the well-known location will find it. Second, the cip.md declaration also emits a parallel Content-Signal: directive in the rights holder's robots.txt, derived from the CIP fields per the v3.1.1 derivation specification. An AI operator that respects robots.txt sees the derived signal even without parsing the underlying CIP file. Third, individual pages can carry a <meta name="cip-mixed-rights-block"> tag (per v3.4 Mixed-Rights Architecture) referencing per-page rights declarations, discoverable by any HTML parser. Across the three mechanisms, a TDM opt-out declared in cip.md is discoverable by any AI ingester operating with reasonable diligence.
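The three mechanisms can be illustrated with a short sketch of what a diligent ingester might check, using only the Python standard library. It works on text already fetched, so no network access is involved. Several details are assumptions made for illustration and are not taken from the v3.1.1 or v3.4 specifications: the Content-Signal: value shown in the usage note, the use of a content attribute on the meta tag, and the names content_signal_lines, CipMetaFinder and discover.

```python
from html.parser import HTMLParser


def content_signal_lines(robots_txt):
    """Return the value of every 'Content-Signal:' line in a robots.txt body."""
    values = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-signal":
            values.append(value.strip())
    return values


class CipMetaFinder(HTMLParser):
    """Records the content attribute of each <meta name="cip-mixed-rights-block">.

    Assumption: the tag points at its per-page declaration through a
    'content' attribute; the source does not say which attribute is used.
    """

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "cip-mixed-rights-block":
            self.found.append(attributes.get("content"))


def discover(robots_txt, page_html):
    """Summarise the three discovery routes for one site and one page."""
    finder = CipMetaFinder()
    finder.feed(page_html)
    return {
        "well_known_path": "/.well-known/cip.md",  # route 1: fixed location
        "content_signals": content_signal_lines(robots_txt),  # route 2
        "meta_blocks": finder.found,  # route 3
    }
```

For example, calling discover with a robots.txt body containing the placeholder line "Content-Signal: ai-train=no" and a page whose head contains <meta name="cip-mixed-rights-block" content="/rights/page-1"> would report one signal and one meta block. An ingester that finds any of the three has enough to locate the declaration.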

Operational consequence — the evidentiary point

The framework's central evidentiary claim is that the cip.md declaration produces a timestamped, machine-readable, third-party-resolvable record of what the rights holder declared and when. When an AI operator subsequently asserts they did not know about the opt-out, the rights holder can produce server logs (or third-party archive evidence such as Wayback Machine captures) showing the declaration was published before the alleged ingestion. The framework's published reference parser specification means any neutral analyst can reproduce the parse and confirm the opt-out was operationally machine-readable. This is the evidentiary infrastructure that turns subsisting rights from an abstract legal entitlement into an enforceable position.
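One way a rights holder or neutral analyst might make such a record reproducible is to fingerprint the declaration as fetched and store the fetch time alongside it. The sketch below is an assumption about how that could be done and is not a procedure the framework prescribes; evidence_record and predates are hypothetical names, and the caller is expected to supply timezone-aware datetimes.

```python
import hashlib
from datetime import datetime, timezone


def evidence_record(body, fetched_at):
    """Fingerprint a fetched declaration so a later copy can be compared byte for byte.

    body is the raw bytes of cip.md; fetched_at is a timezone-aware datetime.
    """
    return {
        "sha256": hashlib.sha256(body).hexdigest(),
        "fetched_at": fetched_at.astimezone(timezone.utc).isoformat(),
    }


def predates(record, ingestion_time):
    """True when the recorded fetch happened strictly before the alleged ingestion."""
    return datetime.fromisoformat(record["fetched_at"]) < ingestion_time
```

A matching hash from an independent capture (a server log entry or an archive snapshot) would show that the same bytes were being served at both times, and predates gives the ordering against the date of the alleged ingestion.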

Conclusion

CIP's position is that a TDM opt-out declared through the four-field cip.md combination, served from /.well-known/cip.md with parallel Content-Signal: and per-page meta-tag discoverability, meets the machine-readability requirement under the UK DUA Act 2025 and EU AI Act Article 53. The position is defensible against the standardisation and discoverability counter-arguments. The operational consequence is that AI operators who ingest content despite a properly declared opt-out face an evidentiary baseline that the framework deliberately designed to be reproducible by any neutral analyst.