Top-level report checks (44)
The primary checks shown as report sections and listed in the public catalog.
- robots.txt
- llms.txt
- Structured data
- OpenAPI
- OAuth discovery
- MCP server card
- WebMCP
- Agent commerce
Check catalog
CanAgentUse audits the signals that determine whether AI search systems and agents can discover, understand, cite, authenticate with, and safely use a website or web application.
The catalog below lists 44 top-level report checks. The full 129-signal count includes those checks, 38 nested validation steps, and 47 performance, accessibility, SEO, and browser audit signals that appear inside scan reports.
- Top-level report checks (44): The primary checks shown as report sections and listed in the public catalog.
- Nested validation steps (38): Step-level checks that appear inside richer report sections when the scanner validates a resource or analyzes AI SEO readiness.
- Browser audit signals (47): Browser-based technical audit signals in scan reports, covering category scores, grouped audit families, and focused direct audits.
These are the 44 primary checks users see in the report. Checks with richer validation expand into step-level findings inside the report, which is why the total signal count is higher than the visible catalog count.
Signals that help search systems, crawlers, and agents find canonical resources and machine-readable alternates.
Established
Publish /robots.txt with clear crawl rules.
robots.txt gives crawlers and agents a standard place to read crawl permissions, disallowed paths, and sitemap locations before requesting site content.
Established
Publish a sitemap and reference it from robots.txt.
Sitemaps help crawlers and agents discover canonical URLs, updated pages, and deeper content that may not be obvious from the homepage navigation.
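As a minimal sketch of how a client might confirm that robots.txt references a sitemap, the snippet below uses Python's standard `urllib.robotparser`. The robots.txt body, the `example.com` host, and the helper name `sitemaps_from_robots` are illustrative assumptions, not part of the scanner.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; example.com and its paths are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/blog/sitemap.xml
"""


def sitemaps_from_robots(body: str) -> list[str]:
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    # site_maps() returns None when no Sitemap lines are present (Python 3.8+).
    return parser.site_maps() or []


print(sitemaps_from_robots(ROBOTS_TXT))
```

An empty result would suggest the sitemap exists only by convention (for example at /sitemap.xml) and is not advertised to crawlers.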
Established
Include Link response headers for agent discovery using RFC 8288.
Link headers let automated clients discover API catalogs, documentation, and machine-readable alternates without parsing page markup first.
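To illustrate what an automated client can read from a Link header, here is a simplified parser sketch. It is not a complete RFC 8288 implementation: it assumes parameter values contain no commas or semicolons, and the header value and paths are hypothetical.

```python
import re

# Matches one <target> followed by its ";name=value" parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
# Matches a single parameter with a quoted or bare value.
PARAM_RE = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value: str) -> list[dict[str, str]]:
    """Split a Link header into dicts with 'href' plus lowercase parameters."""
    links = []
    for target, raw_params in LINK_RE.findall(value):
        link = {"href": target}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            # Keep the first occurrence of a repeated parameter.
            link.setdefault(name.lower(), quoted or bare)
        links.append(link)
    return links


# Hypothetical header value advertising an API catalog and documentation.
HEADER = '</.well-known/api-catalog>; rel="api-catalog", </docs/api>; rel="service-doc"; type="text/html"'
links = parse_link_header(HEADER)
```

A client could then filter `links` by `rel` to locate the catalog or documentation without fetching or parsing the page body.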
Established
Expose schema.org JSON-LD so AI search systems can parse page entities.
JSON-LD is a low-friction structured data format that agents can extract without interpreting page presentation or microdata markup.
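The sketch below shows one way an agent could pull JSON-LD out of a page using only the Python standard library. The HTML sample, the organization name, and the class name `JsonLdExtractor` are made up for illustration; real pages may carry several blocks or `@graph` arrays that need further handling.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON-LD documents from <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.documents.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than failing the whole page.


# Hypothetical page markup.
HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Co", "url": "https://example.com"}
</script>
</head><body><p>Hello</p></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(HTML)
types = [doc.get("@type") for doc in extractor.documents]
```

Note that the extractor never looks at visible markup, which is the property the check relies on.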
Established
Identify the site owner and website entity in structured data.
Organization and WebSite schema help agents identify the publisher, canonical site identity, logo, and related profiles for attribution.
Established
Use schema types that describe the page's primary content or offering.
High-value schema types tell agents whether a page is an article, product, event, service, or other actionable content type.
Established
Expose question-and-answer content in FAQPage structured data when FAQs are present.
FAQPage schema lets agents extract visible question-and-answer content cleanly and avoid guessing which text is an answer.
Established
Verify that IndexNow ownership key placement is detectable when the site advertises it.
IndexNow lets sites notify participating search engines about changed URLs, but ownership verification requires a UTF-8 key file whose filename matches the key.
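The key-file rule above can be sketched as a small offline check. The helper names are hypothetical, and the key pattern (8 to 128 letters, digits, or dashes) reflects one reading of the IndexNow documentation and should be treated as an assumption.

```python
import re

# Assumed key format: 8 to 128 characters from a-z, A-Z, 0-9, and "-".
KEY_RE = re.compile(r"[A-Za-z0-9-]{8,128}")


def key_file_url(host: str, key: str) -> str:
    """Root-level location where a key file named after the key would live."""
    return f"https://{host}/{key}.txt"


def key_file_is_valid(key: str, file_name: str, body: bytes) -> bool:
    """Check that the file name matches the key and the body is the UTF-8 key."""
    if not KEY_RE.fullmatch(key):
        return False
    if file_name != f"{key}.txt":
        return False
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return text.strip() == key


KEY = "a1b2c3d4e5f60718"  # Placeholder key for illustration.
ok = key_file_is_valid(KEY, f"{KEY}.txt", KEY.encode("utf-8"))
```

In this sketch a mismatch between the file name and the key fails the check even when the file body is correct.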
Page structure, freshness, attribution, semantic HTML, and structured data signals that make content easier to extract and cite.
Established
Return HTML responses as markdown when agents request it.
Markdown negotiation gives agents a cleaner representation of page content while preserving normal HTML for browsers.
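A minimal sketch of the client side of markdown negotiation, assuming the server honors an `Accept: text/markdown` request header and answers with a `text/markdown` content type. The helper names and the q-value are illustrative; the snippet only builds the request and does not contact any server.

```python
from urllib.request import Request


def markdown_request(url: str) -> Request:
    """Build a request that prefers markdown but still accepts HTML."""
    return Request(url, headers={"Accept": "text/markdown, text/html;q=0.8"})


def is_markdown_response(content_type: str) -> bool:
    """True when a Content-Type header value names text/markdown."""
    return content_type.split(";", 1)[0].strip().lower() == "text/markdown"


req = markdown_request("https://example.com/docs/getting-started")
accept = req.get_header("Accept")
```

A browser sending its usual Accept header would keep receiving HTML, which is what makes the negotiation safe to deploy.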
Established
Expose readable page structure through semantic HTML and accessible controls.
Semantic HTML gives browsers, assistive technology, search systems, and agents reliable landmarks, headings, controls, form semantics, and image context.
Established
Expose machine-readable page entities and relationships.
Structured data gives agents explicit entities, relationships, and page meaning that are harder to infer reliably from visual layout alone.
Established
Identify content authors or publishers for trust and attribution.
Author attribution helps agents cite content responsibly, assess source credibility, and distinguish editorial pages from anonymous marketing copy.
Established
Expose modified and published dates for freshness-aware retrieval and ranking.
Freshness signals help agents decide whether content is current enough to cite, summarize, or compare against newer sources.
Emerging recommendation
Publish llms.txt to summarize agent-readable site guidance.
llms.txt gives language-model clients a concise, curated summary of important site context and links before they crawl broadly.
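For illustration, the sketch below reads an llms.txt file laid out the way the llms.txt proposal describes it: an H1 title, a blockquote summary, and H2 sections containing markdown link lists. The sample content and the function name `parse_llms_txt` are assumptions, and the parser ignores anything outside that basic shape.

```python
import re

# "- [title](url): optional notes"
LINK_RE = re.compile(r"^-\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*(.*))?$")


def parse_llms_txt(text: str) -> dict:
    """Extract title, summary, and per-section links from an llms.txt body."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(
                    {"title": match.group(1), "url": match.group(2), "notes": match.group(3) or ""}
                )
    return result


# Hypothetical llms.txt content.
SAMPLE = """# Example Co
> Example Co provides an order-tracking API.

## Docs
- [Quickstart](https://example.com/docs/quickstart.md): First API call
- [API reference](https://example.com/docs/api.md)
"""

parsed = parse_llms_txt(SAMPLE)
```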
AI SEO checks for generative engines, Google AI Overviews, answer engines, extraction-friendly structure, and trust signals.
Emerging recommendation
Make page content easy for AI answer engines to extract, cite, and attribute.
Generative engines favor pages with self-contained answer passages, clear entities, structured data, summaries, FAQ patterns, and crawler-accessible HTML that can be cited without extra interpretation.
Emerging recommendation
Make page content eligible and useful for Google AI Overviews and AI Mode extraction.
AI Overviews depend on Google-search eligibility, useful visible content, consistent structured data, answer-first sections, trust signals, and preview controls that permit snippets.
Emerging recommendation
Make page content easy for answer engines and assistants to answer from directly.
Answer engines need concise answers, question-led structure, entity clarity, visible evidence, and trust signals that can be extracted without relying on search-only metadata checks.
Crawler policy and bot identity signals that clarify which AI systems can retrieve, index, train on, or use site content.
Established
Add User-agent rules for AI crawlers like GPTBot, Claude-Web, and others.
Explicit AI bot rules reduce ambiguity for crawler operators and make training, indexing, or retrieval access policy auditable.
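One way to audit explicit AI bot rules is to evaluate the same URL for several user agents. The sketch below does this with `urllib.robotparser`; the robots.txt policy, the `OtherBot` agent, and the `access_matrix` helper are hypothetical, and Python's parser applies the first matching rule in a group, which may differ from how a given crawler resolves conflicts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot is partly allowed, Claude-Web is blocked,
# and every other agent falls back to the wildcard group.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/
Allow: /

User-agent: Claude-Web
Disallow: /

User-agent: *
Allow: /
"""


def access_matrix(body: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user agent to whether robots.txt permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(body.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


AGENTS = ["GPTBot", "Claude-Web", "OtherBot"]
blog = access_matrix(ROBOTS_TXT, AGENTS, "https://example.com/blog/post")
private = access_matrix(ROBOTS_TXT, AGENTS, "https://example.com/private/report")
```

Publishing named groups like these is what makes the policy auditable: each agent's outcome can be read directly from the file instead of being inferred from the wildcard group.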
Informational
Declare AI content usage preferences with Content Signal in robots.txt.
Content Signal provides a machine-readable way to communicate AI usage preferences where participating crawlers look for policy.
Informational
Advertise HTTP Message Signatures keys when this site operates signed bot clients or supports Web Bot Auth workflows.
Web Bot Auth discovery lets servers and clients find signing keys for bot identity workflows based on HTTP Message Signatures.
Machine-readable API discovery checks for API catalogs, OpenAPI documents, and compact AI context endpoints.
Established
Publish an API catalog for automated API discovery using RFC 9727.
API catalogs help agents find service descriptions, documentation, and status resources without guessing API entry points.
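As a rough sketch, an RFC 9727 catalog served in the linkset JSON format can be walked like this. The catalog content, the `api.example.com` URLs, and the helper `links_by_relation` are invented for illustration; only the `linkset`, `anchor`, and `href` keys and the relation names come from the linkset format and registered link relations.

```python
# Hypothetical catalog, shaped like a linkset document.
CATALOG = {
    "linkset": [
        {
            "anchor": "https://api.example.com/orders",
            "service-desc": [
                {"href": "https://api.example.com/orders/openapi.json", "type": "application/json"}
            ],
            "service-doc": [
                {"href": "https://example.com/docs/orders", "type": "text/html"}
            ],
            "status": [{"href": "https://status.example.com/orders"}],
        }
    ]
}


def links_by_relation(catalog: dict, relation: str) -> list[str]:
    """Collect target URLs for one link relation across all catalog entries."""
    hrefs = []
    for context in catalog.get("linkset", []):
        for target in context.get(relation, []):
            if "href" in target:
                hrefs.append(target["href"])
    return hrefs


descriptions = links_by_relation(CATALOG, "service-desc")
```

An agent would typically fetch the `service-desc` targets next, since those point at machine-readable descriptions such as OpenAPI documents.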
Established
Publish a valid OpenAPI or Swagger document for API discovery.
OpenAPI documents let agents understand available operations, schemas, authentication, and request formats before calling an API.
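The following sketch shows a shallow shape check for an already-parsed OpenAPI or Swagger document. It is not schema validation and does not claim to mirror the scanner's logic; the function name and the messages are illustrative.

```python
def openapi_problems(doc: dict) -> list[str]:
    """Return human-readable problems found in a parsed OpenAPI/Swagger document."""
    problems = []

    # OpenAPI 3.x uses "openapi"; Swagger 2.0 uses "swagger".
    version = doc.get("openapi") or doc.get("swagger")
    if not isinstance(version, str):
        problems.append("missing 'openapi' or 'swagger' version string")
        version = ""

    info = doc.get("info")
    if not (isinstance(info, dict) and "title" in info and "version" in info):
        problems.append("'info' must include 'title' and 'version'")

    # OpenAPI 3.1 allows paths, components, or webhooks; earlier versions need paths.
    if version.startswith("3.1"):
        has_surface = any(key in doc for key in ("paths", "components", "webhooks"))
    else:
        has_surface = "paths" in doc
    if not has_surface:
        problems.append("no 'paths' (or 3.1 'components'/'webhooks') section")

    return problems


# Hypothetical minimal document.
DOC = {
    "openapi": "3.0.3",
    "info": {"title": "Orders API", "version": "1.0.0"},
    "paths": {"/orders": {"get": {"responses": {"200": {"description": "OK"}}}}},
}
issues = openapi_problems(DOC)
```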
Emerging recommendation
Expose a compact API context endpoint agents can fetch before deciding which public API or discovery resource to use.
A context endpoint gives agents a small, low-latency summary of product purpose, safe actions, and canonical machine-readable resources without scraping the whole site.
Authentication discovery checks for OAuth, OIDC, authorization-server metadata, protected-resource metadata, and JWKS surfaces.
Established
Publish OAuth/OIDC discovery metadata so agents can authenticate with your APIs.
OAuth and OIDC discovery let agents find authorization, token, and key endpoints programmatically instead of relying on human documentation.
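A minimal sketch of programmatic discovery: it derives the two well-known metadata URLs for an issuer (OpenID Connect Discovery and RFC 8414) and reports which endpoints a metadata document lacks. The issuer, the helper names, and the choice of fields to treat as expected are assumptions; the specifications make some of these fields conditional.

```python
from urllib.parse import urlsplit, urlunsplit

# Fields this sketch expects; the specs make some of them optional.
EXPECTED_FIELDS = ("issuer", "authorization_endpoint", "token_endpoint", "jwks_uri")


def discovery_urls(issuer: str) -> list[str]:
    """Candidate metadata URLs for an issuer, OIDC style first, then RFC 8414 style."""
    parts = urlsplit(issuer)
    path = parts.path.rstrip("/")
    origin = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return [
        # OIDC appends the well-known suffix after the issuer path.
        f"{origin}{path}/.well-known/openid-configuration",
        # RFC 8414 inserts the well-known segment before the issuer path.
        f"{origin}/.well-known/oauth-authorization-server{path}",
    ]


def missing_fields(metadata: dict) -> list[str]:
    """List expected metadata fields that are absent or empty."""
    return [name for name in EXPECTED_FIELDS if not metadata.get(name)]


urls = discovery_urls("https://auth.example.com/tenant1")
```

With a path-less issuer such as `https://auth.example.com`, both candidates sit directly under `/.well-known/` on the origin.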
Emerging recommendation
Publish OAuth Protected Resource Metadata so agents can discover how to authenticate.
Protected Resource metadata tells agents which authorization servers protect an API and how to connect authentication challenges to the right resource.
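The link between an authentication challenge and the right resource can be sketched as follows, assuming the RFC 9728 conventions of a `resource_metadata` parameter in `WWW-Authenticate` and `resource` and `authorization_servers` fields in the metadata document. The URLs and helper names are hypothetical.

```python
import re
from typing import Optional


def resource_metadata_url(www_authenticate: str) -> Optional[str]:
    """Extract the resource_metadata URL from a WWW-Authenticate challenge."""
    match = re.search(r'resource_metadata="([^"]+)"', www_authenticate)
    return match.group(1) if match else None


def authorization_servers(metadata: dict, expected_resource: str) -> list[str]:
    """Return authorization servers only if the metadata describes the expected resource."""
    if metadata.get("resource") != expected_resource:
        return []  # Metadata for a different resource must not be trusted.
    return list(metadata.get("authorization_servers", []))


# Hypothetical 401 challenge and metadata document.
CHALLENGE = 'Bearer resource_metadata="https://api.example.com/.well-known/oauth-protected-resource"'
METADATA = {
    "resource": "https://api.example.com",
    "authorization_servers": ["https://auth.example.com"],
}

metadata_url = resource_metadata_url(CHALLENGE)
servers = authorization_servers(METADATA, "https://api.example.com")
```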
Model Context Protocol and WebMCP checks for server cards, manifests, browser annotations, and tool discovery.
Emerging recommendation
Publish an MCP Server Card for agent discovery.
MCP Server Cards help agents discover server transports, capabilities, and protocol details before opening an MCP session.
Emerging recommendation
Expose a stable MCP server metadata document that points agents to the site's MCP endpoint.
MCP clients need trustworthy server metadata, protocol version, transport details, and capability hints before connecting to a remote MCP server.
Informational
Publish a WebMCP manifest for declarative browser tool discovery when using the draft manifest convention.
A WebMCP manifest advertises browser-exposed tools declaratively so agents can understand available site actions before invoking them.
Informational
Support WebMCP to expose site tools to AI agents via the browser.
WebMCP can expose page context and actions directly through the browser, giving agents safer structured hooks than screen scraping alone.
Agent-facing manifest, directory, A2A card, and skills-index checks that expose capabilities before use.
Emerging recommendation
Publish an agent skills discovery index.
An Agent Skills index lets clients find task-specific SKILL.md documents that describe how to use site capabilities correctly.
Emerging recommendation
Publish the singular Agent Web Protocol agent.json manifest without confusing it with the separate agents.json directory convention.
agent.json is an emerging machine-readable manifest for declaring what a website does, how agents authenticate, and which actions or protocols are available.
Informational
Publish an agents.json directory for agent-facing capabilities and contacts when using this convention.
agents.json gives clients a simple directory of agent-facing capabilities and contacts when a site chooses to advertise them.
Emerging recommendation
Publish an agent card so A2A-compatible clients can discover capabilities.
A2A Agent Cards let compatible clients discover agent skills, input and output modes, and the endpoint used to invoke those skills.
Agent-commerce checks for payment and transaction metadata including x402, MPP, UCP, and ACP.
Emerging recommendation
Support x402 protocol for agent-native HTTP payments.
x402 metadata helps agents recognize payable resources and understand how to satisfy HTTP payment requirements programmatically.
Emerging recommendation
Support MPP for agent-native HTTP payments.
MPP metadata tells agents which API operations require payment and how machine payment authorization should be handled.
Informational
Enable content payments via Universal Commerce Protocol when this site has paid content or commerce surfaces.
Universal Commerce Protocol discovery helps agents identify paid content or commerce endpoints and the payment metadata needed to use them.
Informational
Publish ACP discovery metadata when agents need to discover this site's commerce API.
ACP discovery gives commerce agents a structured way to find protocol endpoints, versions, and payment metadata before transacting.
Browser, transport, and security headers that improve reliability for users, crawlers, and automated clients.
Established
Require HTTPS for repeat visits with Strict-Transport-Security.
HSTS tells browsers to keep using HTTPS after the first secure visit, reducing downgrade and mixed-transport risk for repeat users.
Established
Constrain script, style, frame, and resource loading with a Content-Security-Policy header.
Content Security Policy limits where scripts, styles, frames, and connections can load from, reducing the impact of injection bugs.
Established
Prevent MIME sniffing for browser-loaded resources.
X-Content-Type-Options prevents browsers from treating mislabeled files as executable content, reducing content-sniffing attacks.
Established
Prevent unwanted framing with X-Frame-Options or CSP frame-ancestors.
Frame protection blocks hostile sites from embedding pages in deceptive frames, reducing clickjacking risk.
Established
Limit how much referrer data leaves the site.
Referrer-Policy controls how much URL context is sent to other origins, limiting accidental leakage of paths, queries, and identifiers.
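The five header checks in this group can be sketched as a single presence audit over a response's headers. The function name, result keys, and sample header values are illustrative assumptions; the sketch tests only whether each protection is declared, not whether the policy itself is strong.

```python
def audit_security_headers(headers: dict[str, str]) -> dict[str, bool]:
    """Report which of the five browser security protections a response declares."""
    # Header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    csp = h.get("content-security-policy", "")
    return {
        "hsts": "max-age=" in h.get("strict-transport-security", "").lower(),
        "csp": bool(csp.strip()),
        "nosniff": h.get("x-content-type-options", "").strip().lower() == "nosniff",
        # Either the legacy header or the CSP directive counts as frame protection.
        "frame_protection": "x-frame-options" in h or "frame-ancestors" in csp.lower(),
        "referrer_policy": bool(h.get("referrer-policy", "").strip()),
    }


# Hypothetical response headers.
RESPONSE_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
}
report = audit_security_headers(RESPONSE_HEADERS)
```

Here frame protection passes through the CSP `frame-ancestors` directive even though `X-Frame-Options` is absent.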
Machine-readable text/data mining and AI usage preference signals.
Emerging recommendation
Publish a machine-readable text and data mining reservation declaration when the site needs one.
TDMRep is a W3C Community Group protocol and IANA-registered well-known URI for declaring text and data mining reservation policy on applicable content.
Emerging recommendation
Publish a human-readable AI usage policy when the site needs one.
ai.txt is an emerging, non-IANA convention for publishing AI usage, attribution, contact, and training guidance; it is advisory and not required for every site.