1. Abstract
Publish an advisory human-readable AI usage policy only when the site intentionally needs one.
ai.txt is a fragmented emerging convention. It can communicate human-readable AI crawling, training, attribution, restriction, and contact guidance, but it is not a standard access-control mechanism and absence should not be penalized.
2. Classification
- Check ID: ai-txt-policy
- Check version: 1.0.0
- Package path: lib/checks/ai-txt-policy/versions/1.0.0
- Category: AI Discoverability
- Subcategory: Bot Access Control
- Check group: Bot Policy
- Check group ID: bot-policy
- Maturity: Emerging recommendation
- Scope: site
- Check weight: 1
3. Input And Output Contracts
- Input: [email protected]
- Output: [email protected]
- Resources inspected: /ai.txt
4. Scoring Semantics
| Step ID | Title | Weight | Description |
|---|---|---|---|
| fetch-ai-txt | Fetch /ai.txt | 0.2 | Fetch the optional root ai.txt policy file. |
| validate-transport | Validate transport | 0.2 | Validate status, content type, encoding, and size for the policy file. |
| parse-policy | Parse ai.txt policy | 0.25 | Parse AI Visibility-style sections or classify non-standard text formats. |
| validate-policy-content | Validate policy content | 0.35 | Validate required policy sections, useful guidance, contact/attribution evidence, and safety risks. |
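The step weights above sum to 1.0. The following is a minimal sketch of how per-step scores could be folded into one check score, assuming a plain weighted average over step scores on a 0 to 100 scale. The aggregation rule and the function name `combine_step_scores` are assumptions for illustration; this document does not state how the package combines steps.

```python
from typing import Dict

# Step weights copied from the scoring table; they sum to 1.0.
STEP_WEIGHTS: Dict[str, float] = {
    "fetch-ai-txt": 0.20,
    "validate-transport": 0.20,
    "parse-policy": 0.25,
    "validate-policy-content": 0.35,
}


def combine_step_scores(step_scores: Dict[str, float]) -> float:
    """Weighted average of per-step scores (each 0-100).

    Assumption: the real package may aggregate differently.
    """
    missing = sorted(set(STEP_WEIGHTS) - set(step_scores))
    if missing:
        raise ValueError(f"missing step scores: {missing}")
    total = sum(STEP_WEIGHTS[step] * step_scores[step] for step in STEP_WEIGHTS)
    return round(total, 2)
```

Under this assumption, a file that fetches and transports cleanly but only half parses and fails content validation would land at 52.5.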
5. Package Documentation
ai.txt Policy Check v1.0.0
Status
- Version: 1.0.0
- Check identifier: ai-txt-policy
- Input contract: [email protected]
- Output contract: [email protected]
- Scope: site
Abstract
This check validates an optional /ai.txt advisory policy file. Absence is not penalized because ai.txt is not an IETF, W3C, IANA, or major-vendor standard. When the file is present, this version performs deeper validation against the AI Visibility section model because it is the most concrete documented ai.txt convention found in the research pass.
Motivation
ai.txt has name recognition, but the market is fragmented. The strongest signals are self-published proposals: AI Visibility's sectioned policy file, aitxt.ing's Markdown/context file, and an academic DSL proposal. None has clear major-platform adoption.
The check therefore treats /ai.txt as an emerging, human-readable Bot Access Control policy surface. It can communicate intent around AI crawling, training, attribution, restrictions, contact, and content scope, but it does not enforce access and does not replace robots.txt, AI bot rules, Content Signal, TDMRep, RSL, or Web Bot Auth.
Normative Model
This version uses the AI Visibility ai.txt guidance as the primary validation model when /ai.txt is present:
- File path: /ai.txt.
- Preferred media type: text/plain; charset=utf-8.
- Public, unauthenticated, readable text.
- Required sections: [identity], [permissions], [restrictions].
- Recommended sections: [attribution], [contact], [content-types].
The check also recognizes the following formats, but does not pass them as fully AI Visibility compliant:
- aitxt.ing-style Markdown/context files with YAML frontmatter or links to other ai.txt context files.
- Generic Markdown or plain text AI policy files.
- Research/proposal DSL content from the arXiv paper, which is treated only as non-standard text unless it also satisfies the practical section model above.
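The required and recommended sections listed above can be illustrated with a small parser. This is a sketch only: it assumes each section starts with a bracketed name on its own line and runs until the next bracketed name, and the helper names `parse_sections` and `missing_sections` are hypothetical rather than taken from the package.

```python
import re
from typing import Dict, List, Optional, Sequence

REQUIRED_SECTIONS = ("identity", "permissions", "restrictions")
RECOMMENDED_SECTIONS = ("attribution", "contact", "content-types")

# A section header is a bracketed name alone on a line, e.g. "[identity]".
_HEADER = re.compile(r"^\[([A-Za-z][A-Za-z0-9-]*)\]$")


def parse_sections(text: str) -> Dict[str, str]:
    """Split ai.txt text into {section name: body text}."""
    bodies: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        match = _HEADER.match(line.strip())
        if match:
            current = match.group(1).lower()
            bodies.setdefault(current, [])
        elif current is not None:
            bodies[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in bodies.items()}


def missing_sections(sections: Dict[str, str], names: Sequence[str]) -> List[str]:
    """Names that are absent or have an empty body."""
    return [name for name in names if not sections.get(name)]
```

Treating an empty section the same as a missing one mirrors the criteria that required sections must contain meaningful text; what counts as "meaningful" beyond non-empty is left out of this sketch.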
Applicability
The check is applicable only when /ai.txt is reachable with a non-404, non-410 response.
If /ai.txt is absent, the result is not_applicable with score 100. Absence should not create a warning or failure.
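The applicability rule can be sketched as a small status mapping. The `not_applicable` label and the score of 100 come from the text above; the `applicable` label, the constant name, and the function name are assumptions for illustration.

```python
from typing import Optional, Tuple

# Statuses that mean "no ai.txt published"; these are not penalized.
ABSENT_STATUSES = (404, 410)


def applicability(http_status: int) -> Tuple[str, Optional[int]]:
    """Return (state, score). The score is fixed only when the file is absent."""
    if http_status in ABSENT_STATUSES:
        return ("not_applicable", 100)
    # Any other response means the file is treated as present and is
    # evaluated by the later steps, so no score is decided here.
    return ("applicable", None)
```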
Pass Criteria
- /ai.txt returns HTTP 200.
- The file is readable text and is not HTML boilerplate, binary content, or an error page.
- The file follows the AI Visibility section model.
- [identity], [permissions], and [restrictions] contain meaningful text.
- Policy content includes concrete permission, restriction, and AI training/model-use language.
- [contact] contains a usable email address or URL.
- The file does not expose secrets or prompt-injection style instructions.
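One of the criteria above asks for a usable email address or URL in [contact]. A rough way to test that is sketched below, assuming "usable" means a syntactically plausible email (optionally prefixed with `mailto:`) or an http(s) URL with a host. The regular expression and the function name `has_usable_contact` are illustrative assumptions, not the package's actual rule.

```python
import re
from urllib.parse import urlparse

# Deliberately loose email shape; real-world validation is stricter.
_EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def has_usable_contact(contact_text: str) -> bool:
    """True if any whitespace-separated token looks like an email or http(s) URL."""
    for token in contact_text.split():
        candidate = token.strip(".,;<>()")
        if candidate.lower().startswith("mailto:"):
            candidate = candidate[len("mailto:"):]
        if _EMAIL.fullmatch(candidate):
            return True
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            return True
    return False
```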
Warning Criteria
- The file is present but uses a non-standard convention such as aitxt.ing-style Markdown/context text.
- The file is text-compatible but not served as text/plain.
- Recommended sections are missing or empty.
- Permission, restriction, training, attribution, or contact language is vague or missing.
- The file links to external ai.txt context files.
- The file is unusually large for an advisory policy document.
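The media type warning above can be illustrated with a small header classifier. This sketch assumes "text-compatible" means any other `text/*` type; the category labels and the function name are hypothetical, and HTML boilerplate is a content-level failure that a media type check alone would not catch.

```python
from typing import Optional


def classify_media_type(content_type_header: Optional[str]) -> str:
    """Classify a Content-Type header value for /ai.txt.

    Returns "preferred" for text/plain, "text-compatible" for other text/*
    types (assumed to be warning territory), and "other" otherwise.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = (content_type_header or "").split(";", 1)[0].strip().lower()
    if media_type == "text/plain":
        return "preferred"
    if media_type.startswith("text/"):
        return "text-compatible"
    return "other"
```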
Failure Criteria
- /ai.txt returns a non-OK status other than 404 or 410.
- The file is reachable but empty.
- The file appears binary or unreadable.
- The response is HTML boilerplate or an error document.
- An AI Visibility-style file is missing required section content.
- The file appears to expose secrets, tokens, credentials, or private keys.
- The file contains prompt-injection style instructions rather than policy guidance.
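The last two failure criteria concern exposed secrets and prompt-injection style text. The sketch below shows one heuristic way to flag both. Every pattern here is an illustrative assumption chosen for the example, not the rule set the package uses, and a real scanner would need a broader, tuned list.

```python
import re
from typing import Dict, List

# Illustrative secret shapes (assumed, not exhaustive).
SECRET_PATTERNS = {
    "private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "assigned-credential": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\s*[:=]\s*\S{8,}"
    ),
}

# Illustrative prompt-injection phrases (assumed, not exhaustive).
INJECTION_PATTERNS = {
    "ignore-previous": re.compile(r"(?i)\bignore (all )?(previous|prior) instructions\b"),
    "system-prompt": re.compile(r"(?i)\b(reveal|print) (your )?system prompt\b"),
}


def scan_safety(text: str) -> Dict[str, List[str]]:
    """Return the names of the patterns that matched, grouped by risk type."""
    return {
        "secrets": sorted(n for n, p in SECRET_PATTERNS.items() if p.search(text)),
        "prompt_injection": sorted(
            n for n, p in INJECTION_PATTERNS.items() if p.search(text)
        ),
    }
```

Returning pattern names rather than the matched text avoids copying a leaked credential into the check's own output.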
Evidence Model
The result emits:
- Fetch status, content type, length, and path.
- Transport media type, line count, warnings, and capped raw excerpt.
- Detected convention: ai-visibility, aitxt-ing, generic-markdown, or unknown-text.
- Section names, missing required sections, and missing recommended sections.
- YAML frontmatter keys for aitxt.ing-style files.
- Markdown headings and discovered links.
- Same-origin and external ai.txt context links.
- Policy content signals for permissions, restrictions, training/model use, attribution, and contact.
- Safety evidence for possible secrets and prompt-injection phrases.
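The evidence items above could be carried in a single record. The dataclass below is a sketch of such a shape. All field names, the 500-character excerpt cap, and the class name `AiTxtEvidence` are assumptions for illustration; only the four convention labels come from this document.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional

CONVENTIONS = ("ai-visibility", "aitxt-ing", "generic-markdown", "unknown-text")
EXCERPT_CAP = 500  # assumed cap for the raw excerpt, in characters


@dataclass
class AiTxtEvidence:
    path: str = "/ai.txt"
    fetch_status: Optional[int] = None
    content_type: Optional[str] = None
    content_length: Optional[int] = None
    line_count: int = 0
    transport_warnings: List[str] = field(default_factory=list)
    raw_excerpt: str = ""
    convention: str = "unknown-text"
    section_names: List[str] = field(default_factory=list)
    missing_required: List[str] = field(default_factory=list)
    missing_recommended: List[str] = field(default_factory=list)
    frontmatter_keys: List[str] = field(default_factory=list)
    markdown_headings: List[str] = field(default_factory=list)
    same_origin_context_links: List[str] = field(default_factory=list)
    external_context_links: List[str] = field(default_factory=list)
    policy_signals: Dict[str, bool] = field(default_factory=dict)
    possible_secrets: List[str] = field(default_factory=list)
    prompt_injection_phrases: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.convention not in CONVENTIONS:
            raise ValueError(f"unknown convention: {self.convention!r}")
        # Keep the excerpt capped so evidence stays small.
        self.raw_excerpt = self.raw_excerpt[:EXCERPT_CAP]
```

Calling `asdict(evidence)` turns the record into a plain dictionary that can be serialized alongside the check result.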
Validation And Scoring Steps
| Step | Weight | Purpose |
|---|---|---|
| fetch-ai-txt | 0.20 | Fetch /ai.txt. |
| validate-transport | 0.20 | Validate status, content type, encoding, and size. |
| parse-policy | 0.25 | Parse AI Visibility-style sections or classify non-standard text. |
| validate-policy-content | 0.35 | Validate useful policy guidance and safety risks. |
Standard Behavior
For sites that intentionally publish ai.txt, use the AI Visibility section model:
[identity]
Site: Example
Owner: Example Inc.
[permissions]
AI search and user-requested summarization are allowed with attribution.
[restrictions]
Model training, dataset creation, and commercial scraping require permission.
[attribution]
Credit Example and link to the source URL.
[contact]
mailto:[email protected]
[content-types]
Applies to public editorial pages and documentation.
Non-Standard And Real-World Behavior
This version records aitxt.ing-style Markdown/context files and generic Markdown policy text as non-standard advisory evidence. These formats may be useful for humans and some agents, but they do not satisfy the AI Visibility section model.
The arXiv DSL proposal is treated as research-level. The check does not attempt to enforce or execute DSL, XML, or prompt compliance logic.
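The split between the AI Visibility model and the non-standard formats can be sketched as a convention detector that returns one of the four labels this document names. The detection order and the regular expressions are assumptions: bracketed section headers win first, then YAML frontmatter or a Markdown link to another ai.txt file, then any Markdown heading.

```python
import re

# Bracketed AI Visibility section header alone on a line.
_SECTION = re.compile(
    r"^\[(identity|permissions|restrictions|attribution|contact|content-types)\]\s*$",
    re.MULTILINE | re.IGNORECASE,
)
# YAML frontmatter fenced by "---" at the very start of the file.
_FRONTMATTER = re.compile(r"\A---\s*\n.*?\n---\s*(\n|\Z)", re.DOTALL)
# Markdown link whose target ends in ai.txt.
_AI_TXT_LINK = re.compile(r"\]\(\S*ai\.txt\)", re.IGNORECASE)
# ATX-style Markdown heading.
_HEADING = re.compile(r"^#{1,6}\s+\S", re.MULTILINE)


def detect_convention(text: str) -> str:
    """Classify ai.txt text into one of the four convention labels."""
    if _SECTION.search(text):
        return "ai-visibility"
    if _FRONTMATTER.match(text) or _AI_TXT_LINK.search(text):
        return "aitxt-ing"
    if _HEADING.search(text):
        return "generic-markdown"
    return "unknown-text"
```

Consistent with the limitation stated above, the sketch only classifies text; it does not try to interpret or execute DSL or XML content.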
Non-Goals And Limitations
- This check does not require any site to publish /ai.txt.
- This check does not treat ai.txt as access control.
- This check does not verify crawler compliance.
- This check does not decide legal enforceability.
- This check does not consume sibling outputs from robots.txt, AI bot rules, Content Signal, TDMRep, RSL, Web Bot Auth, or llms.txt.
- This check does not fetch external ai.txt chains beyond recording links.
References
Source: lib/checks/ai-txt-policy/versions/1.0.0/docs.md
6. Version Changelog
ai-txt-policy v1.0.0 Changelog
Initial versioned package for ai-txt-policy.
Implements isolated runtime behavior for optional /ai.txt validation. Missing files are not applicable. Present files are deeply validated against the AI Visibility section model, with weaker warnings for aitxt.ing-style Markdown or generic advisory text.
Source: lib/checks/ai-txt-policy/versions/1.0.0/changelog.md