1. Abstract

Publish an advisory human-readable AI usage policy only when the site intentionally needs one.

ai.txt is a fragmented emerging convention. It can communicate human-readable AI crawling, training, attribution, restriction, and contact guidance, but it is not a standard access-control mechanism and absence should not be penalized.

2. Classification

Check ID: ai-txt-policy
Check version: 1.0.0
Package path: lib/checks/ai-txt-policy/versions/1.0.0
Category: AI Discoverability
Subcategory: Bot Access Control
Check group: Bot Policy
Check group ID: bot-policy
Maturity: Emerging recommendation
Scope: site
Check weight: 1

3. Input And Output Contracts

Input: [email protected]
Output: [email protected]
Resources inspected: /ai.txt

4. Scoring Semantics

Step ID	Title	Weight	Description
`fetch-ai-txt`	Fetch /ai.txt	`0.2`	Fetch the optional root ai.txt policy file.
`validate-transport`	Validate transport	`0.2`	Validate status, content type, encoding, and size for the policy file.
`parse-policy`	Parse ai.txt policy	`0.25`	Parse AI Visibility-style sections or classify non-standard text formats.
`validate-policy-content`	Validate policy content	`0.35`	Validate required policy sections, useful guidance, contact/attribution evidence, and safety risks.
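
The four step weights sum to 1.0. A minimal sketch of one way to combine them follows, assuming each step reports a score from 0 to 100 and the check score is the weighted sum. The aggregation rule and the `combine_step_scores` name are assumptions for illustration; this document lists only the weights.

```python
# Step weights as listed in the scoring table.
STEP_WEIGHTS = {
    "fetch-ai-txt": 0.2,
    "validate-transport": 0.2,
    "parse-policy": 0.25,
    "validate-policy-content": 0.35,
}


def combine_step_scores(step_scores: dict[str, float]) -> float:
    """Weighted sum of per-step scores (assumed to be on a 0-100 scale).

    Hypothetical aggregation: the package may combine steps differently.
    Steps missing from `step_scores` count as 0.
    """
    total = sum(
        weight * step_scores.get(step_id, 0.0)
        for step_id, weight in STEP_WEIGHTS.items()
    )
    return round(total, 2)
```

Under this assumption, a file that passes fetch, transport, and parsing but fails content validation entirely would score 65.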

5. Package Documentation

ai.txt Policy Check v1.0.0

Status

Version: 1.0.0
Check identifier: ai-txt-policy
Input contract: [email protected]
Output contract: [email protected]
Scope: site

Abstract

This check validates an optional /ai.txt advisory policy file. Absence is not penalized because ai.txt is not an IETF, W3C, IANA, or major-vendor standard. When the file is present, this version performs deeper validation against the AI Visibility section model because it is the most concrete documented ai.txt convention found in the research pass.

Motivation

ai.txt has name recognition, but the market is fragmented. The strongest signals are self-published proposals: AI Visibility's sectioned policy file, aitxt.ing's Markdown/context file, and an academic DSL proposal. None has clear major-platform adoption.

The check therefore treats /ai.txt as an emerging, human-readable Bot Access Control policy surface. It can communicate intent around AI crawling, training, attribution, restrictions, contact, and content scope, but it does not enforce access and does not replace robots.txt, AI bot rules, Content Signal, TDMRep, RSL, or Web Bot Auth.

Normative Model

This version uses the AI Visibility ai.txt guidance as the primary validation model when /ai.txt is present:

File path: /ai.txt.
Preferred media type: text/plain; charset=utf-8.
Public, unauthenticated, readable text.
Required sections:
[identity]
[permissions]
[restrictions]
Recommended sections:
[attribution]
[contact]
[content-types]
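
The section model above can be sketched as a small parser that splits the file on bracketed headers and reports which required or recommended sections are absent or empty. This is an illustrative sketch, not the package's parser: the function names, the header regex, and the case-insensitive matching are assumptions.

```python
import re

REQUIRED_SECTIONS = ("identity", "permissions", "restrictions")
RECOMMENDED_SECTIONS = ("attribution", "contact", "content-types")

# Assumed header shape: a line holding only a bracketed name, e.g. "[identity]".
_SECTION_HEADER = re.compile(r"^\[([A-Za-z][A-Za-z0-9-]*)\]$")


def parse_sections(text: str) -> dict[str, str]:
    """Split ai.txt text into {section name: body text}."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = _SECTION_HEADER.match(line.strip())
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def missing_sections(sections: dict[str, str]) -> dict[str, list[str]]:
    """List required and recommended sections that are absent or empty."""
    def absent(names):
        return [name for name in names if not sections.get(name)]

    return {
        "required": absent(REQUIRED_SECTIONS),
        "recommended": absent(RECOMMENDED_SECTIONS),
    }


SAMPLE = """[identity]
Site: Example

[permissions]
AI search is allowed with attribution.

[restrictions]
Model training requires permission.
"""

report = missing_sections(parse_sections(SAMPLE))
# report == {"required": [], "recommended": ["attribution", "contact", "content-types"]}
```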

The check also recognizes the following formats, but does not pass them as fully AI Visibility compliant:

aitxt.ing-style Markdown/context files with YAML frontmatter or links to other ai.txt context files.

Generic Markdown or plain text AI policy files.
Research/proposal DSL content from the arXiv paper, which is treated only as non-standard text unless it also satisfies the practical section model above.
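
A rough sketch of how these formats might be told apart is shown below. The four labels match the detected-convention values this document lists in its evidence model; the heuristics themselves (any required bracketed section header, leading YAML frontmatter or a Markdown link to an ai.txt file, Markdown headings) and the `detect_convention` name are assumptions, not the package's actual rules.

```python
import re

REQUIRED = {"identity", "permissions", "restrictions"}

_SECTION_HEADER = re.compile(r"^\[([A-Za-z][A-Za-z0-9-]*)\]\s*$", re.MULTILINE)
# YAML frontmatter: a leading '---' line closed by a later '---' line.
_FRONTMATTER = re.compile(r"---\s*\n.*?\n---\s*(\n|$)", re.DOTALL)
# Markdown link whose target mentions an ai.txt file.
_AI_TXT_LINK = re.compile(r"\]\([^)]*ai\.txt[^)]*\)")
_MD_HEADING = re.compile(r"^#{1,6}\s+\S", re.MULTILINE)


def detect_convention(text: str) -> str:
    """Heuristically label ai.txt text with one of four convention names."""
    stripped = text.lstrip("\ufeff").strip()
    headers = {m.group(1).lower() for m in _SECTION_HEADER.finditer(stripped)}
    if headers & REQUIRED:
        return "ai-visibility"
    if _FRONTMATTER.match(stripped) or _AI_TXT_LINK.search(stripped):
        return "aitxt-ing"
    if _MD_HEADING.search(stripped):
        return "generic-markdown"
    return "unknown-text"
```

Checking the section model first means a file that mixes bracketed sections with Markdown is still validated against the stricter model rather than downgraded to a warning-only format.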

Applicability

The check is applicable only when /ai.txt is reachable with a non-404, non-410 response.

If /ai.txt is absent (HTTP 404 or 410), the result is not_applicable with score 100. Absence should not create a warning or failure.
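
The applicability rule can be sketched as a mapping from the fetch status to an outcome. The `Applicability` type and `classify_fetch` name are hypothetical, and treating only HTTP 200 as OK is an assumption (redirects are assumed to have been followed before this point).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Applicability:
    applicable: bool
    status: str           # "not_applicable", "applicable", or "fail"
    score: Optional[int]  # fixed score when known up front, else None


def classify_fetch(http_status: int) -> Applicability:
    """Map the /ai.txt HTTP status to the check's applicability."""
    if http_status in (404, 410):
        # Absent file: not applicable, full score, no warning or failure.
        return Applicability(False, "not_applicable", 100)
    if http_status == 200:
        # Present file: the later validation steps decide the score.
        return Applicability(True, "applicable", None)
    # Any other non-OK status is listed under the failure criteria.
    return Applicability(True, "fail", None)
```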

Pass Criteria

/ai.txt returns HTTP 200.
The file is readable text and is not HTML boilerplate, binary content, or an error page.

The file follows the AI Visibility section model.
[identity], [permissions], and [restrictions] contain meaningful text.
Policy content includes concrete permission, restriction, and AI training/model-use language.

[contact] contains a usable email address or URL.
The file does not expose secrets or prompt-injection style instructions.
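
As an illustration of the contact criterion, the sketch below treats a [contact] body as usable when it contains something shaped like an email address or an http(s) URL. The regex and the `has_usable_contact` name are assumptions; the package may apply stricter or looser rules.

```python
import re
from urllib.parse import urlparse

# Deliberately loose email shape; not a full RFC 5322 validator.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def has_usable_contact(contact_body: str) -> bool:
    """Return True if the text holds an email address or an http(s) URL."""
    if _EMAIL.search(contact_body):
        return True
    for token in contact_body.split():
        try:
            parsed = urlparse(token.strip("<>()[],.;"))
        except ValueError:
            # urlparse rejects some malformed hosts; treat them as unusable.
            continue
        if parsed.scheme in ("http", "https") and "." in parsed.netloc:
            return True
    return False
```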

Warning Criteria

The file is present but uses a non-standard convention such as aitxt.ing-style Markdown/context text.

The file is text-compatible but not served as text/plain.
Recommended sections are missing or empty.
Permission, restriction, training, attribution, or contact language is vague or missing.

The file links to external ai.txt context files.
The file is unusually large for an advisory policy document.
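
The transport-related warnings above could be derived from the response headers and body roughly as follows. This is a sketch under stated assumptions: the 64 KiB size threshold, the warning labels, and the handling of a missing Content-Type header are all hypothetical, since this document gives no concrete values for them.

```python
from email.message import Message

MAX_ADVISORY_BYTES = 64 * 1024  # hypothetical threshold; the source gives no number


def transport_warnings(content_type_header: str, body: bytes) -> list[str]:
    """Collect transport warnings for a fetched /ai.txt response."""
    warnings = []
    if not content_type_header.strip():
        warnings.append("missing-content-type")
    else:
        # Reuse the stdlib header parser to split media type and charset.
        msg = Message()
        msg["Content-Type"] = content_type_header
        media_type = msg.get_content_type()
        charset = msg.get_content_charset()
        if media_type != "text/plain":
            warnings.append(f"not-text-plain:{media_type}")
        if charset not in (None, "utf-8"):
            warnings.append(f"non-utf8-charset:{charset}")
    if len(body) > MAX_ADVISORY_BYTES:
        warnings.append("unusually-large")
    return warnings
```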

Failure Criteria

/ai.txt returns a non-OK status other than 404 or 410.
The file is reachable but empty.
The file appears binary or unreadable.
The response is HTML boilerplate or an error document.
An AI Visibility-style file is missing required section content.
The file appears to expose secrets, tokens, credentials, or private keys.
The file contains prompt-injection style instructions rather than policy guidance.
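
The two safety-related failure criteria could be approximated with pattern matching, as in the sketch below. The patterns are illustrative assumptions only and are far from exhaustive; the names `SECRET_PATTERNS`, `INJECTION_PATTERNS`, and `scan_safety` are hypothetical and do not describe the package's actual detection rules.

```python
import re

# Illustrative patterns only; a real scanner would need a broader, tuned list.
SECRET_PATTERNS = {
    "private-key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "assigned-secret": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*\S{8,}"
    ),
}
INJECTION_PATTERNS = {
    "ignore-instructions": re.compile(
        r"(?i)\bignore (?:all |any )?(?:previous|prior|above) instructions\b"
    ),
    "role-override": re.compile(r"(?i)\byou are now\b"),
    "system-prompt": re.compile(r"(?i)\bsystem prompt\b"),
}


def scan_safety(text: str) -> dict[str, list[str]]:
    """Return the names of secret and prompt-injection patterns found in text."""
    secrets = [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]
    injections = [name for name, pat in INJECTION_PATTERNS.items() if pat.search(text)]
    return {"secrets": secrets, "prompt_injection": injections}
```

Returning pattern names rather than the matched text keeps any detected secret out of the emitted evidence.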

Evidence Model

The result emits:

Fetch status, content type, length, and path.
Transport media type, line count, warnings, and capped raw excerpt.
Detected convention: ai-visibility, aitxt-ing, generic-markdown, or unknown-text.

Section names, missing required sections, and missing recommended sections.
YAML frontmatter keys for aitxt.ing-style files.
Markdown headings and discovered links.
Same-origin and external ai.txt context links.
Policy content signals for permissions, restrictions, training/model use, attribution, and contact.

Safety evidence for possible secrets and prompt-injection phrases.

Validation And Scoring Steps

Step	Weight	Purpose
`fetch-ai-txt`	0.20	Fetch `/ai.txt`.
`validate-transport`	0.20	Validate status, content type, encoding, and size.
`parse-policy`	0.25	Parse AI Visibility-style sections or classify non-standard text.
`validate-policy-content`	0.35	Validate useful policy guidance and safety risks.

Standard Behavior

For sites that intentionally publish ai.txt, use the AI Visibility section model:

[identity]
Site: Example
Owner: Example Inc.

[permissions]
AI search and user-requested summarization are allowed with attribution.

[restrictions]
Model training, dataset creation, and commercial scraping require permission.

[attribution]
Credit Example and link to the source URL.

[contact]
mailto:[email protected]

[content-types]
Applies to public editorial pages and documentation.

Non-Standard And Real-World Behavior

This version records aitxt.ing-style Markdown/context files and generic Markdown policy text as non-standard advisory evidence. These formats may be useful for humans and some agents, but they do not satisfy the AI Visibility section model.

The arXiv DSL proposal is treated as research-level. The check does not attempt to enforce or execute DSL, XML, or prompt compliance logic.

Non-Goals And Limitations

This check does not require any site to publish /ai.txt.
This check does not treat ai.txt as access control.
This check does not verify crawler compliance.
This check does not decide legal enforceability.
This check does not consume sibling outputs from robots.txt, AI bot rules, Content Signal, TDMRep, RSL, Web Bot Auth, or llms.txt.

This check does not fetch external ai.txt chains beyond recording links.

References

Source: lib/checks/ai-txt-policy/versions/1.0.0/docs.md

6. Version Changelog

ai-txt-policy v1.0.0 Changelog

Initial versioned package for ai-txt-policy.

Implements isolated runtime behavior for optional /ai.txt validation. Missing files are not applicable. Present files are deeply validated against the AI Visibility section model, with weaker warnings for aitxt.ing-style Markdown or generic advisory text.

Source: lib/checks/ai-txt-policy/versions/1.0.0/changelog.md