Check specification

structured-data 1.0.0

Structured data

Versioned package for the structured-data check.

Assessment Suite: 2026.06.10
Maturity: Established
Category: AI Discoverability
Subcategory: Page Structure

1. Abstract

Expose machine-readable page entities and relationships through a recognized structured-data syntax.

Structured data gives agents explicit entities, relationships, and page meaning that are harder to infer reliably from visual layout alone.

2. Classification

Check ID: structured-data
Check version: 1.0.0
Package path: lib/checks/structured-data/versions/1.0.0
Category: AI Discoverability
Subcategory: Content Readiness
Check group: Page Structure
Check group ID: page-structure
Maturity: Established
Scope: page
Check weight: 1

3. Input And Output Contracts

Resources inspected: JSON-LD, Microdata, RDFa

4. Scoring Semantics

| Step ID | Title | Weight | Description |
| --- | --- | --- | --- |
| format-presence | Recognized structured data format | 0.3 | Detect JSON-LD, Microdata, or RDFa structured data in the page HTML. |
| parseability | Parseability | 0.25 | Validate that detected structured data can be parsed without fatal syntax errors. |
| entity-typing | Schema.org entity typing | 0.25 | Confirm that detected structured data exposes typed entities rather than empty markup. |
| format-consistency | Format consistency | 0.2 | Warn when multiple syntaxes are mixed and fail when the same entity conflicts across syntaxes. |
| page-relevant-schema-family | Page-relevant schema family | 0.2 | When visible page intent is specific enough, verify structured data includes a matching primary Schema.org family. |
| minimum-useful-schema-fields | Minimum useful schema fields | 0.15 | Validate minimum fields for detected primary schema families without requiring unrelated schema types. |
| schema-policy-recommendations | Schema policy recommendations | 0.05 | Warn when detected primary schema misses search feature recommended fields. |
| supporting-schema-linkage | Supporting schema linkage | 0.1 | Check whether primary schema is connected to supporting entities such as Organization, WebSite, BreadcrumbList, Offer, Person, or ImageObject. |

5. Package Documentation

Structured Data Check v1.0.0

Status

Abstract

This check verifies whether a page exposes machine-readable structured data through JSON-LD, Microdata, or RDFa. It also performs the schema-family validation that used to live in separate long-tail schema checks.

The active structured-data check now covers 15 merged schema families:

  • faq-page-schema
  • breadcrumb-schema
  • article-schema
  • product-schema
  • software-application-schema
  • local-business-schema
  • event-schema
  • media-schema
  • qa-page-schema
  • discussion-forum-schema
  • profile-page-schema
  • course-schema
  • job-posting-schema
  • recipe-schema
  • service-schema

author-attribution and organization-website-schema remain separate top-level checks because they signal site trust and publisher identity, which goes beyond page-family schema validation.

Motivation

Agents, search engines, and semantic parsers can infer some meaning from visible HTML, but structured data gives them explicit entities, relationships, identifiers, names, URLs, and content types. JSON-LD, Microdata, and RDFa are different syntaxes for expressing that machine-readable model. A robust page can use any of them, but mixed or contradictory implementations are harder to debug and can cause different consumers to extract different facts.

The long-tail schema checks were merged here because exposing every family as a standalone report check inflated the suite count and gave schema a disproportionate effect on the overall score. Inside this check, family-specific schema validation still produces evidence, but it applies only when the visible page intent or existing markup makes the family relevant.

Normative Model

The check recognizes:

  • JSON-LD blocks using script[type="application/ld+json"].
  • Microdata entities using itemscope, itemtype, and itemprop.
  • RDFa subjects using typeof, property, resource, and about.
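The recognition rules above can be sketched with the Python standard library. This is a minimal illustration, not the package's implementation: the names `FormatDetector` and `detect_formats` are hypothetical, and keying Microdata on `itemscope` and RDFa on `typeof` alone is a simplifying assumption.

```python
from html.parser import HTMLParser


class FormatDetector(HTMLParser):
    """Collects which structured-data syntaxes appear in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr_map = dict(attrs)
        script_type = (attr_map.get("type") or "").strip().lower()
        if tag == "script" and script_type == "application/ld+json":
            self.found.add("json-ld")
        # Assumption: an itemscope attribute marks a Microdata entity.
        if "itemscope" in attr_map:
            self.found.add("microdata")
        # Assumption: a typeof attribute marks an RDFa subject.
        if "typeof" in attr_map:
            self.found.add("rdfa")


def detect_formats(html: str) -> list:
    """Return the detected syntaxes in sorted order."""
    detector = FormatDetector()
    detector.feed(html)
    detector.close()
    return sorted(detector.found)
```

A page with one JSON-LD script and one `itemscope` element would yield both `json-ld` and `microdata`, which is the mixed-syntax case the format-consistency step warns about.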

The check treats JSON-LD as the preferred implementation format for new Schema.org markup because current search platform guidance recommends JSON-LD where possible. Microdata and RDFa remain accepted when they expose valid, typed, consistent entities.

Types are normalized by removing https://schema.org/ and schema: prefixes.
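As a small sketch of that normalization, assuming only the two prefixes named above are stripped (the function name `normalize_type` is hypothetical):

```python
# Only the prefixes named in this document; other spellings are left untouched.
SCHEMA_PREFIXES = ("https://schema.org/", "schema:")


def normalize_type(raw: str) -> str:
    """Strip a known Schema.org prefix from a type name, if present."""
    value = raw.strip()
    for prefix in SCHEMA_PREFIXES:
        if value.startswith(prefix):
            return value[len(prefix):]
    return value
```

With this, `https://schema.org/Product` from Microdata and `schema:Product` from RDFa both compare equal to a bare `Product` from JSON-LD.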

Applicability

The structured-data syntax checks apply to public HTML pages where machine-readable page meaning, identity, content type, commerce, editorial metadata, navigation, or local/business facts would help agents understand the page.

The merged schema-family checks are stricter. A family is applicable only when:

  • visible page text contains enough specific intent hints for that family, or
  • the page already declares one of that family's Schema.org types.

If neither condition is true, the family is recorded as not_applicable and does not affect the check score.
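The applicability rule can be illustrated as follows. This is a sketch under stated assumptions: the document does not say how many hints count as "enough", so the `min_hints` threshold and the substring matching are hypothetical, as is the function name.

```python
def family_applicability(visible_text, intent_hints, expected_types,
                         present_types, min_hints=2):
    """Decide whether one merged schema family applies to a page.

    min_hints is an assumed threshold for "enough specific intent hints".
    """
    text = visible_text.lower()
    matched = [hint for hint in intent_hints if hint.lower() in text]
    intent_matched = len(matched) >= min_hints
    # The family also applies when the page already declares one of its types.
    declared = sorted(set(expected_types) & set(present_types))
    return {
        "applicable": intent_matched or bool(declared),
        "intentMatched": intent_matched,
        "presentTypes": declared,
    }
```

A family whose result has `applicable` set to false would be recorded as not_applicable and left out of scoring.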

Schema Family Matrix

| Merged family | Applicable when visible page intent includes | Expected Schema.org types | Pass | Fail | Warning |
| --- | --- | --- | --- | --- | --- |
| FAQPage schema | FAQ or question/answer content | FAQPage | Applicable family has FAQPage and useful fields such as mainEntity. | FAQ intent is detected but no FAQPage family is present. | Schema is present but minimum useful fields or supporting linkage are incomplete. |
| Breadcrumb schema | Breadcrumb or breadcrumb navigation text | BreadcrumbList | Applicable family has BreadcrumbList and useful fields such as itemListElement. | Breadcrumb intent is detected but no BreadcrumbList is present. | Schema is present but field/linkage quality is incomplete. |
| Article schema | Article, blog, news, guide, report, author, byline, published, or updated signals | Article, BlogPosting, NewsArticle, Report, Review | Applicable family has a matching Article-family type and minimum fields such as headline, author, and datePublished. | Editorial intent is detected but no matching Article-family type is present. | Matching schema exists but minimum fields or supporting linkage are incomplete. |
| Product schema | Product, price, pricing, buy, cart, checkout, SKU, stock, availability, or offer signals | Product, ProductGroup | Applicable family has Product schema and useful fields such as name, description, and offers. | Commerce intent is detected but no Product/ProductGroup schema is present. | Product schema exists but field/linkage quality is incomplete. |
| SoftwareApplication schema | Software, app, SaaS, platform, API, SDK, integration, download, or app-store signals | SoftwareApplication, WebApplication, MobileApplication | Applicable family has matching software schema and useful fields such as name, applicationCategory, and operatingSystem. | Software intent is detected but no matching application schema is present. | Matching schema exists but field/linkage quality is incomplete. |
| LocalBusiness schema | Address, hours, directions, location, restaurant, store, clinic, office, or near-me signals | LocalBusiness | Applicable family has LocalBusiness and useful fields such as name, address, and telephone. | Local-business intent is detected but no LocalBusiness schema is present. | Schema exists but field/linkage quality is incomplete. |
| Event schema | Event, webinar, conference, ticket, venue, starts, schedule, or agenda signals | Event, BroadcastEvent | Applicable family has Event schema and useful fields such as name, startDate, and location. | Event intent is detected but no Event/BroadcastEvent schema is present. | Event schema exists but field/linkage quality is incomplete. |
| Media schema | Video, watch, transcript, duration, thumbnail, or upload-date signals | ImageObject, VideoObject, Clip | Applicable family has matching media schema and useful media fields. | Media intent is detected but no matching media schema is present. | Media schema exists but field/linkage quality is incomplete. |
| QAPage schema | Question/answer, asked, answered, or FAQ-like signals | QAPage | Applicable family has QAPage and useful fields such as mainEntity. | Q&A intent is detected but no QAPage schema is present. | Schema exists but field/linkage quality is incomplete. |
| DiscussionForumPosting schema | Thread, forum, discussion, reply, replies, or posts signals | DiscussionForumPosting | Applicable family has discussion schema and useful fields such as headline, author, and datePublished. | Discussion intent is detected but no DiscussionForumPosting schema is present. | Schema exists but field/linkage quality is incomplete. |
| ProfilePage schema | Profile, author bio, about me, person, member, or team-member signals | ProfilePage | Applicable family has ProfilePage and useful fields such as mainEntity. | Profile intent is detected but no ProfilePage schema is present. | Schema exists but field/linkage quality is incomplete. |
| Course schema | Course, lesson, curriculum, enroll, instructor, or syllabus signals | Course | Applicable family has Course and useful fields such as name, description, and provider. | Course intent is detected but no Course schema is present. | Course schema exists but field/linkage quality is incomplete. |
| JobPosting schema | Job, career, role, employment, salary, apply now, hiring, or job-location signals | JobPosting | Applicable family has JobPosting and useful fields such as title, hiringOrganization, and jobLocation. | Job intent is detected but no JobPosting schema is present. | Job schema exists but field/linkage quality is incomplete. |
| Recipe schema | Recipe, ingredients, cook time, prep time, nutrition, or instructions signals | Recipe | Applicable family has Recipe and useful fields such as name, recipeIngredient, and recipeInstructions. | Recipe intent is detected but no Recipe schema is present. | Recipe schema exists but field/linkage quality is incomplete. |
| Service schema | Commerce/service, price, pricing, offer, product, buy, cart, or checkout signals where a service is represented | Service | Applicable family has Service and useful fields such as name and provider. | Service/commerce intent is detected but no Service schema is present. | Service schema exists but field/linkage quality is incomplete. |

Step Results

The check emits seven validation steps:

  1. Recognized structured data format.
     • Weight: 0.30
     • Pass: JSON-LD, Microdata, or RDFa is present.
     • Fail: no recognized structured-data syntax is found.
  2. Parseability.
     • Weight: 0.25
     • Pass: detected structured data parses into entities.
     • Fail: detected structured data has fatal parse issues.
     • Not applicable: no structured-data syntax was found.
  3. Schema.org entity typing.
     • Weight: 0.25
     • Pass: at least one extracted entity has an explicit type.
     • Fail: structured data exists but no typed Schema.org entity is extracted.
     • Not applicable: no structured-data syntax was found.
  4. Format consistency.
     • Weight: 0.20
     • Pass: one syntax is used, or duplicated entities agree.
     • Warning: multiple syntaxes are mixed without detected same-entity conflicts.
     • Fail: duplicated entities conflict on id, name, or url.
     • Not applicable: no structured-data syntax was found.
  5. Page-relevant schema family.
     • Weight: 0.20
     • Pass: specific visible page intent has a matching merged schema family.
     • Fail: specific visible page intent is detected but no matching family is present.
     • Not applicable: no eligible page-family intent or existing merged-family schema is detected.
  6. Minimum useful schema fields.
     • Weight: 0.15
     • Pass: best matching family node has at least 75% of the minimum useful fields.
     • Warning: some minimum fields are present but coverage is below 75%.
     • Fail: a matching family node exists but none of its minimum useful fields are present.
     • Not applicable: no eligible primary or page-relevant family node is present.
  7. Supporting schema linkage.
     • Weight: 0.10
     • Pass: supporting entities or linked @id references are present.
     • Warning: primary/page-relevant schema exists without supporting linkage.
     • Not applicable: no eligible primary or page-relevant family node is present.
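The 75% rule for the minimum-useful-fields step can be sketched like this. The function name is hypothetical, and treating empty strings, lists, and objects as missing is an assumption rather than documented behavior.

```python
def field_coverage_status(node: dict, minimum_fields: list) -> str:
    """Map minimum-field coverage of one schema node to a step status."""
    if not minimum_fields:
        return "not_applicable"
    # Assumption: empty values count as missing.
    present = [f for f in minimum_fields if node.get(f) not in (None, "", [], {})]
    coverage = len(present) / len(minimum_fields)
    if coverage >= 0.75:
        return "pass"
    if present:
        return "warning"
    return "fail"
```

For an Article node checked against headline, author, and datePublished, two of three fields is about 67% coverage and therefore a warning; all three is a pass.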

not_applicable and informational steps are excluded from step-score denominators.
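One way to read the denominator rule is as a weighted average over only the steps that produced a scored status. The sketch below assumes a pass earns full credit, a warning half credit, and a fail none; those credit values and the name `check_score` are illustrative assumptions, not documented behavior.

```python
# Assumed credit per status; the document does not state these values.
STATUS_CREDIT = {"pass": 1.0, "warning": 0.5, "fail": 0.0}


def check_score(steps):
    """Weighted score over (weight, status) pairs.

    Steps whose status is not in STATUS_CREDIT (for example not_applicable
    or informational) are dropped from both numerator and denominator.
    """
    scored = [(w, STATUS_CREDIT[s]) for w, s in steps if s in STATUS_CREDIT]
    total_weight = sum(w for w, _ in scored)
    if total_weight == 0:
        return None  # nothing was applicable
    return sum(w * c for w, c in scored) / total_weight
```

Under these assumptions, a page that passes the first three steps, gets a warning on format consistency, and has no applicable family steps scores (0.30 + 0.25 + 0.25 + 0.10) / 1.00 = 0.90.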

Evidence Model

The result evidence includes:

  • formats: per-format presence, validity, count, errors, types, and extracted entity summaries.
  • formatsFound: detected syntaxes.
  • primaryFormat: one of json-ld, microdata, rdfa, mixed, or none.
  • mixedFormats: whether more than one syntax was found.
  • entityCount: number of extracted entity summaries.
  • schemaTypes: normalized Schema.org type names found across syntaxes.
  • entities: capped entity summaries with format provenance.
  • conflicts: same-entity property conflicts across syntaxes.
  • schemaFamilies: one row per merged schema family with id, title, status, applicable, intentMatched, expectedTypes, and presentTypes.

Each entity summary records:

  • format
  • source
  • types
  • id
  • name
  • url
  • selected extracted properties
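The entity summary and the primaryFormat value could be modeled roughly as below. The field names follow the lists above; the class name, the Python types, and the rule that more than one detected syntax yields `mixed` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntitySummary:
    """One extracted entity with its format provenance."""
    format: str                 # "json-ld", "microdata", or "rdfa"
    source: str                 # where the entity was found (assumed free-form)
    types: list                 # normalized Schema.org type names
    id: Optional[str] = None
    name: Optional[str] = None
    url: Optional[str] = None
    properties: dict = field(default_factory=dict)  # selected extracted properties


def primary_format(formats_found: list) -> str:
    """Derive primaryFormat from the detected syntaxes (assumed rule)."""
    unique = sorted(set(formats_found))
    if not unique:
        return "none"
    if len(unique) > 1:
        return "mixed"
    return unique[0]
```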

Standard Behavior

JSON-LD is parsed from application/ld+json script blocks. The check flattens top-level arrays and @graph nodes and records @type, @id, name, headline, and url where available.
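A minimal sketch of that flattening, assuming each block is plain JSON and that a root object carrying `@graph` contributes only its graph nodes (the function name is hypothetical, and no context expansion is attempted):

```python
import json

SUMMARY_KEYS = ("@type", "@id", "name", "headline", "url")


def flatten_json_ld(block_text: str) -> list:
    """Flatten one JSON-LD block into per-node summaries."""
    # json.loads raises json.JSONDecodeError on fatal syntax errors.
    data = json.loads(block_text)
    roots = data if isinstance(data, list) else [data]
    nodes = []
    for root in roots:
        if not isinstance(root, dict):
            continue
        graph = root.get("@graph")
        if isinstance(graph, list):
            # Assumption: a root with @graph contributes only its graph nodes.
            nodes.extend(n for n in graph if isinstance(n, dict))
        else:
            nodes.append(root)
    # Keep only the summary keys that are actually present on each node.
    return [{k: n[k] for k in SUMMARY_KEYS if k in n} for n in nodes]
```

A caller could catch `json.JSONDecodeError` to record a parseability failure for the block instead of aborting the whole check.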

Microdata is parsed from itemscope entities. The check reads Schema.org types from itemtype, stable identifiers from itemid, and simple properties from descendant itemprop attributes.

RDFa is parsed from elements with typeof. The check reads types from typeof, stable identifiers from resource or about, and simple properties from descendant property attributes.

Non-Standard And Real-World Behavior

Real sites sometimes:

  • Use JSON-LD for site identity and Microdata for products inherited from ecommerce templates.
  • Leave old RDFa or Microdata fragments in templates after migrating to JSON-LD.
  • Duplicate Organization, Product, or BreadcrumbList entities in more than one syntax.
  • Emit property-only Microdata or RDFa fragments without a clear enclosing entity.
  • Use incomplete JSON-LD blocks generated by tag managers or CMS plugins.

This version allows mixed syntaxes when extracted entities are consistent. It warns because one primary syntax is easier to maintain. It fails only when the same apparent entity has conflicting id, name, or url values across syntaxes.
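The warn-versus-fail distinction can be sketched as follows. How the check decides that two entities are "the same apparent entity" is not specified here, so this example assumes entities from different syntaxes match when they share a normalized type; the function names and the dictionary shape are likewise hypothetical.

```python
from itertools import combinations

COMPARED = ("id", "name", "url")


def find_conflicts(entities):
    """List property conflicts between matching entities in different syntaxes."""
    conflicts = []
    for a, b in combinations(entities, 2):
        if a["format"] == b["format"]:
            continue
        # Assumption: a shared type means "same apparent entity".
        shared = set(a["types"]) & set(b["types"])
        if not shared:
            continue
        for key in COMPARED:
            left, right = a.get(key), b.get(key)
            # Only values present on both sides can conflict.
            if left and right and left != right:
                conflicts.append({
                    "types": sorted(shared),
                    "property": key,
                    "values": {a["format"]: left, b["format"]: right},
                })
    return conflicts


def consistency_status(entities):
    """Map extracted entities to the format-consistency step status."""
    formats = {e["format"] for e in entities}
    if not formats:
        return "not_applicable"
    if find_conflicts(entities):
        return "fail"
    return "warning" if len(formats) > 1 else "pass"
```

Matching on type alone would over-report on pages that legitimately list several distinct products, so a real implementation would likely need a stronger identity signal such as a shared id or url.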

Non-Goals And Limitations

This check does not fully expand JSON-LD contexts, execute JSON-LD framing, or validate every Schema.org property range. It performs pragmatic page-family diagnostics for agent-readiness scoring, not a full Google rich-result eligibility audit.

References

Source: lib/checks/structured-data/versions/1.0.0/docs.md

6. Version Changelog

structured-data v1.0.0 Changelog

Initial versioned package for structured-data.

  • Detects JSON-LD, Microdata, and RDFa.
  • Reports the primary structured-data syntax and mixed-format usage.
  • Emits typed entity summaries with format provenance.
  • Warns when multiple syntaxes are mixed without conflicts.
  • Fails when the same apparent entity conflicts across syntaxes.

Source: lib/checks/structured-data/versions/1.0.0/changelog.md