8

There are two "generic" types of metadata tags in DITA: the data element and the keyword element. Of course there's also othermeta, but apparently that's supposed to be deprecated soon, and the name suggests it's sort of a last resort anyhow.

So the keyword seems to closely resemble tags in web applications, i.e. what is commonly used for "folksonomies". But what is the exact difference between data and keyword, and when should you use which?

Joe DF
Anders

4 Answers

3

The <data> element is primarily for specialization so it's probably not wise to use it directly. The <keyword> element is better.

This:

    <metadata>
        <keywords>
            <keyword>red</keyword>
            <keyword>green</keyword>
            <keyword>blue</keyword>
        </keywords>
    </metadata>

will render to this in the DITA-OT XHTML transform:

    <head>
      <meta name="DC.subject" content="red, green, blue"/>
      <meta name="keywords" content="red, green, blue"/>
    </head>

If you want to add tags, I'd consider using subject scheme maps, which will allow you to include a list of controlled values.

If you specialize the @base or @props attribute, you can add metadata with much more control. Here, we have a @props attribute specialized to @era.

You can then add the @era attribute to an element in a topic, or to the <topicref> element in a map.

    <!-- Subjects that the @era attribute may take as values -->
    <subjectdef keys="era_attributedef">
        <topicmeta>
            <navtitle>Era of production by decade and producer</navtitle>
        </topicmeta>

        <subjectdef keys="producer">
            <hasInstance>
                <subjectdef keys="sixties">
                    <subjectdef keys="verity_lambert"/>
                    <subjectdef keys="john_wiles"/>
                    <subjectdef keys="innes_lloyd"/>
                    <subjectdef keys="peter_bryant"/>
                    <subjectdef keys="derrick_sherwin"/>
                </subjectdef>

                <subjectdef keys="seventies">
                    <subjectdef keys="barry_letts"/>
                    <subjectdef keys="philip_hinchcliff"/>
                    <subjectdef keys="graham_williams"/>
                </subjectdef>

                <subjectdef keys="eighties">
                    <subjectdef keys="john_nathan-turner"/>
                </subjectdef>
            </hasInstance>
        </subjectdef>
    </subjectdef>

    <!-- Bind the @era attribute to the subjects defined above -->
    <enumerationdef>
        <attributedef name="era"/>
        <subjectdef keyref="era_attributedef"/>
    </enumerationdef>
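
As a rough sketch of how the attribute might then be applied in a map: this assumes @era has already been specialized from @props in your document type shell, and the file names are made up for illustration.

```xml
<map>
  <title>Serials by era</title>
  <!-- Hypothetical topic files; @era values come from the subject scheme keys -->
  <topicref href="serial_one.dita" era="sixties"/>
  <topicref href="serial_two.dita" era="seventies"/>
  <topicref href="serial_three.dita" era="eighties"/>
</map>
```

Because the values are bound through the enumerationdef, a scheme-aware editor or processor can flag a value that is not in the list.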
johntait.org
  • Well, I would agree with you that I too lean towards using the keyword element for generic metadata, but that's the confusion. I know the data element is intended for specialization, but apparently not just that. In Eliot Kimber's book DITA for Practitioners it's stated that data is the primary element "for holding arbitrary metadata". So I don't feel the distinction is very clear. Regarding the subjectScheme, that mainly concerns attributes, and does not quite answer this question. – Anders Feb 28 '13 at 21:15
  • I remember reading about a DITA CMS using itself during rendering. I'll add more to the answer if I find the source. – johntait.org Feb 28 '13 at 23:33
3

You are a little off-track here; the keyword element is NOT a metadata element. The keyword element is a generic text element, often used for product names. I think the element that you probably wanted to specify here was the keywords element. Also, you really don't want to write off the othermeta element; it is not deprecated and is quite useful.

keywords element

The keywords element can be used either at the topic or map level. It holds a list of terms from a subject vocabulary, tagged with either the keyword or indexterm elements. The keyword and indexterm elements are considered metadata elements, and they should be reflected in output as appropriate for the medium. The indexterm elements commonly generate indices; in XHTML output, the keyword elements generally are added to the XHTML and used for search-engine optimization. (This is standard functionality of the DITA-OT, although the free PDF rendering engine that ships with the DITA-OT does not generate an index.)

data element

Used as-is, the data element represents a property within a DITA topic or map. The following are the key aspects:

  • The subject of the property is the element that contains the data element. If the property applies to a topic as a whole, it should go in the topic prolog element or in a topicmeta element in a topicref that points to the topic.
  • The @name attribute on the data element is the primary identifier for processors.
  • The value of the property can be expressed in several different ways:
    • Text value, often expressed using the @value attribute
    • Reference to another resource (topic, image, Web resource, etc.), using the @href attribute
    • Complex structure that is composed of nested data elements
  • You can use an optional title element to provide a label for the property.

By default, processors ignore the content of data elements. However, custom processing can be built that uses the content of specific data elements for formatting and so forth.

Used as a basis for specialization, the data element is especially useful. It enables more precise semantics, as well as enumerations of controlled lists of attributes for specific elements. You can see many examples of its use as a specialization base if you examine the metadata elements used in the bookmap and learning & training specializations.

See the data element topic in the DITA 1.2 specification for some concrete examples.

othermeta element

The othermeta element is designed to hold content for which no existing metadata element seems to apply. It essentially holds a name and value pair. You use the @name attribute to name the property and the @content attribute to hold the value.
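
A small sketch of what that looks like in a topic prolog (the name and content shown are made-up placeholders):

```xml
<prolog>
  <metadata>
    <!-- A name/value pair with no dedicated DITA metadata element -->
    <othermeta name="ThemeColor" content="blue"/>
  </metadata>
</prolog>
```

As far as I know, the DITA-OT XHTML transform passes such pairs through as HTML meta elements, but check the output of your own toolkit version before relying on that.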

When should you use which particular element?

  • Use the keywords element to specify index terms and keywords that apply to a specific topic, especially when the content of the keywords element should be used in the generated output.
  • Use the data element to embed properties within a DITA topic or DITA map, especially as an aid for custom processing or to harvest properties for automated processing.
  • Use the data element as a basis for specialization.
  • Use the othermeta element to hold name and value pairs for which a semantic element does not exist.
  • Thanks, but I don't quite agree with everything. I know that keyword is also used as a reuse element, sort of like a variable, but the problem is it is also used as a metadata element, as evidenced by the spec: "All <keyword> or <indexterm> elements in the <keywords> metadata element are considered part of the topic's metadata". And also by the fact that it is part of the metadata element in the prolog... But maybe this is part of the confusion around this element, that it is used for different purposes? +1 for the tip about the learning specialization's use of the data though. – Anders Mar 11 '13 at 00:50
  • The thing about othermeta supposedly being deprecated I cannot confirm, but I read it in a post from Eliot Kimber (http://tech.groups.yahoo.com/group/dita-users/message/29812), where actually he said it was obsolete, but I don't really know. Maybe he could shed some light on it... I do not feel it's very useful in any case though, as I said, the very name suggests it's a last resort. I'd rather use the data element, even unspecialized, in such a case, as it has name and value attributes, pretty much the same as othermeta's name and content. – Anders Mar 11 '13 at 00:57
1

The data element has the @href, @name, and @value attributes, which the keyword element lacks.

So you can define any kind of property you may need for your build.

    <data name="currentTopNavSection" value="profil"/>

I have a couple of scenarios where I need to provide some path information depending on the audience of my documentation. I can use the data element for this.

    <data audience="lifeg" name="active-audience" value="lifeg"/>

This lets me know which audience is active when I filter the documentation.

Another example would be attaching a JavaScript file that is specific to a map.

I am currently working on a webmap specialization where I specialized data to include JavaScript and CSS.

* update 2 *

The data element can be nested. Eliot Kimber explains this in a post; I cannot remember which one. The idea is that it can represent a collection of properties:

   <data name="parent">
        <data name="chilproperty1" value="abc"/>
        <data name="chilproperty2" value="abc"/>
   </data>

This structure is very useful for specialization purposes.

In my understanding, the data element is deliberately generic. It is a way for authors to record very specific needs, specialized or not. It is easy to retrieve the values with XSLT later in the build process.

Lightness Races in Orbit
Bertrand
  • You're right, forgot about that difference, that of course is an indicator of their respective different intents. Still wish this was better documented in the spec. – Anders Mar 06 '13 at 22:22
  • Forgot to talk about the @keyref, which could be very useful. – Bertrand Mar 06 '13 at 22:46
1

The exact difference? There are many differences. Read the spec (sorry, I don't mean to sound unfriendly).

Here is one difference from the spec that has already been mentioned, but I think is worth highlighting, because it might help you decide which to use (or rather, help you decide whether or not to use <data>):

Processors should ignore the content of the <data> element by default, so the <data> element should only be used for properties and not to embed text for formatting as part of the flow of the topic body.

(See also text elsewhere in the spec beginning: "custom processing may...".)

You can use <keyword> "to embed text for formatting as part of the flow of the topic body", but you should not do that with <data>.
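
A minimal sketch of that distinction, with an invented product name and property: the inline keyword is rendered as part of the paragraph, while the data element in the prolog is skipped by default processing.

```xml
<topic id="install">
  <title>Installing the product</title>
  <prolog>
    <metadata>
      <keywords>
        <!-- keyword as metadata: ends up in the HTML head, not the body -->
        <keyword>installation</keyword>
      </keywords>
    </metadata>
    <!-- A property only; ignored unless custom processing reads it -->
    <data name="review-status" value="draft"/>
  </prolog>
  <body>
    <!-- keyword in the text flow: rendered inline with the sentence -->
    <p>Run the <keyword>ExampleInstaller</keyword> program to begin.</p>
  </body>
</topic>
```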

Can you describe your specific use case? (What information do you want to mark up?)

Graham Hannington
  • I have read the spec, quite carefully, and I guess that's where the question is coming from: I think it's very vague about the usage of metadata elements. I know the obvious differences in their data models, and especially the usage of keyword in a non-metadata context. But keyword can also be used as a metadata element, and I mean their differences as metadata tags, i.e. keyword as used in the keywords tag in the prolog metadata elements. This is a different usage compared to when it is used in body text, and the spec says to treat it as comparable to the subject tag of Dublin Core. – Anders Mar 11 '13 at 12:49