How far to go with Structured Data Markup?

Question

The more I am going into the depths of structured data makeup, the more complex and detailed it seems to become. One could even markup areas of the page like footer, header, sidebar, single menu elements etc., I guess a page could easily consist of 80% schema markup and 20% content when taken too seriously. :)

Is it really doing any good to add more than a rough markup skeleton (WebPage or Article) to the potentially hundreds of actual content pages of a website, and shouldn't one only include full author information along with business opening times, contact details etc. on a dedicated contact/business information page? I'm concerned about the bloat. Which kind of markup is recommended for certain types of pages and which of it can be left out because a search engine would compile the information from other parts of the website anyway?

Great question. I guess the overall assessment is "diminishing returns" but of course then you'd have to rank your goals... accessibility... SEO.... appeasing the Google Gods :( Personally I "just far enough" ;) — mayersdesign, Apr 07 '17 at 14:45

score 0 · Accepted Answer · edited May 23 '17 at 12:25

If you only care about user-visible search result features in big seach engine services (e.g., Bing, Google Search, Yahoo! Search and Yandex, which all happen to sponsor Schema.org), the answer is easy: Provide what search engines document to recognize.

Are these user-visible search result features the only things search engines "do" with Schema.org structured data? Probably not. They’ll likely use structured data to better understand page content, and most likely to analyze what other features they could offer in the future. See for example Dan Brickley’s (he is Google’s Schema.org representative) posting about this. But all this is typically not documented by the search engines, of course. So if you care about this, too, the answer would be: Provide what is conceivable to be useful for search engines.

Are search engines the only consumers interested in Schema.org structured data? No, there are countless other consumers (services as well as tools). Enter the world of the Semantic Web and Linked Data. If you know and care about a consumer, the answer is easy again: Provide what this consumer documents to support. But you can’t know them all, of course. So if you care about all these (known and unknown, currently existing and still to appear) consumers, the answer would be: Provide what is conceivable to be useful for all consumers. Or, because the interest of these consumers varies widely, even: Provide what you can.

That said, there are certainly Schema.org types which are rarely useful to provide. A good example are the WebPageElement types, which, as you mentioned, can get used for page areas (header, footer, navigation, sidebar etc.). In my opinion, a typical web page shouldn’t provide these types.

If you care about file sizes, you’ll want to use Microdata/RDFa (because these syntaxes allow you to annotate existing content) instead of JSON-LD (because this syntax requires you to duplicate the content). With RDFa you’ll probably even save slightly more compared to Microdata.
However, structured data typically only represents a fraction of the markup/content anyway, even if you provide as much data as possible.

Instead of repeating "background information" on every page (for example, the full data about the business), you can make use of references: you define a URI for your business (or every other thing) on the page where you fully describe it, and use this URI as property value where applicable on other pages. This is possible with @id (JSON-LD, see an example), itemid (Microdata), and resource (RDFa). The only reason not do this is possibly lacking consumer support for such references (depending on the consumer / the use case, they might not get followed). A middle way might be to provide the item (about the business or any other thing) on every page, once with the full data, and in all other cases with a limited set of data (ideally what is visible on the page, or what is needed for a specific consumer). The URI gets used as identifier for each item, conveying that all these items are about the same thing.

Thanks for your valued feedback! So would you recommend to add full-fledged business information on every 'Article' page or should it be sufficient to have it only on pages providing information about the actual business - and will a search engine make the connection through the 'publisher' property? — richey, Apr 08 '17 at 10:21
@richey: I added a paragraph to my answer (last one) about references. For the case of user-visible search result features, I think no search engine is following references to other pages. For these features, everything that’s needed has to be on the page, which makes sense, of course, because this page is what their users get in the results. -- If search engines follow such references for their other uses of structured data is not documented. -- A good rule is to always provide what is shown on the page, and use the identifier for the item so interested consumers can get the full description. — unor, Apr 08 '17 at 15:13

How far to go with Structured Data Markup?

1 Answers1