
I'm using Jena to write an RDF file that describes online posts. The SIOC ontology/namespace that I'm using defines, for instance, the following:

  • Class: sioc:Post
  • Property: sioc:has_creator

How can I, in Jena, include the sioc:Post in the file as

<sioc:Post rdf:about="http://example.com/vb/1035092"> 

instead of

<rdf:Description rdf:about="http://example.com/vb/1035092">

and what is the best practice?

user2179347

3 Answers


Both of the answers so far make good points:

  • You should not pay much attention to the particular serialization of your RDF graph, because there are lots of different serializations, and you should be accessing them using an API that exposes the graph, not the serialization. (See, for instance, Don't query RDF (or OWL) with XPath in one of my previous answers, for some comments about depending on a particular XML serialization.)
  • The difference that you're seeing is that the most simple RDF/XML serialization will use lots of rdf:Description elements, and these will contain rdf:type elements to indicate the types of the described element. However, the RDF/XML serialization format defines many abbreviations that can be used to make the serialization of a graph much shorter, more readable, and, in some cases, more like a traditional XML document. Others have mentioned that using the type as the element name is just one such abbreviation, but I think it's worth examining the spec on this point. This particular abbreviation is defined in 2.13 Typed Nodes:

It is common for RDF graphs to have rdf:type predicates from subject nodes. These are conventionally called typed nodes in the graph, or typed node elements in the RDF/XML. RDF/XML allows this triple to be expressed more concisely. by replacing the rdf:Description node element name with the namespaced-element corresponding to the RDF URI reference of the value of the type relationship. There may, of course, be multiple rdf:type predicates but only one can be used in this way, the others must remain as property elements or property attributes.

The typed node elements are commonly used in RDF/XML with the built-in classes in the RDF vocabulary: rdf:Seq, rdf:Bag, rdf:Alt, rdf:Statement, rdf:Property and rdf:List.

For example, the RDF/XML in Example 14 could be written as shown in Example 15.

Example 14: Complete example with rdf:type (example14.rdf output example14.nt)

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <rdf:Description rdf:about="http://example.org/thing">
    <rdf:type rdf:resource="http://example.org/stuff/1.0/Document"/>
    <dc:title>A marvelous thing</dc:title>
  </rdf:Description>
</rdf:RDF>

Example 15: Complete example using a typed node element to replace an rdf:type (example15.rdf output example15.nt)

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:ex="http://example.org/stuff/1.0/">
  <ex:Document rdf:about="http://example.org/thing">
    <dc:title>A marvelous thing</dc:title>
  </ex:Document>
</rdf:RDF>
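
To make the equivalence of Examples 14 and 15 concrete, here is a minimal sketch that uses only the JDK's DOM parser to pull the rdf:type statements out of both documents. It is not a real RDF/XML parser: it only looks at top-level node elements with rdf:about and at rdf:type property elements with rdf:resource, and the class and method names (TypedNodeSketch, typeTriples) are made up for illustration. For real work, use an RDF library such as Jena.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only; handles a tiny subset of RDF/XML.
class TypedNodeSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns one "subject rdf:type class" string per type statement found. */
    static List<String> typeTriples(String rdfXml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getNamespaceURI/getLocalName
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(rdfXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> triples = new ArrayList<>();
        NodeList nodes = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (!(nodes.item(i) instanceof Element)) continue;
            Element node = (Element) nodes.item(i);
            String subject = node.getAttributeNS(RDF_NS, "about");
            boolean isDescription = RDF_NS.equals(node.getNamespaceURI())
                    && "Description".equals(node.getLocalName());
            if (!isDescription) {
                // Typed node element: namespace + local name is the class URI.
                triples.add(subject + " rdf:type "
                        + node.getNamespaceURI() + node.getLocalName());
            }
            // Explicit <rdf:type rdf:resource="..."/> property elements.
            NodeList props = node.getChildNodes();
            for (int j = 0; j < props.getLength(); j++) {
                if (!(props.item(j) instanceof Element)) continue;
                Element prop = (Element) props.item(j);
                if (RDF_NS.equals(prop.getNamespaceURI())
                        && "type".equals(prop.getLocalName())) {
                    triples.add(subject + " rdf:type "
                            + prop.getAttributeNS(RDF_NS, "resource"));
                }
            }
        }
        return triples;
    }

    public static void main(String[] args) {
        String header = "<rdf:RDF xmlns:rdf=\"" + RDF_NS + "\""
                + " xmlns:dc=\"http://purl.org/dc/elements/1.1/\""
                + " xmlns:ex=\"http://example.org/stuff/1.0/\">";
        String example14 = header
                + "<rdf:Description rdf:about=\"http://example.org/thing\">"
                + "<rdf:type rdf:resource=\"http://example.org/stuff/1.0/Document\"/>"
                + "<dc:title>A marvelous thing</dc:title>"
                + "</rdf:Description></rdf:RDF>";
        String example15 = header
                + "<ex:Document rdf:about=\"http://example.org/thing\">"
                + "<dc:title>A marvelous thing</dc:title>"
                + "</ex:Document></rdf:RDF>";
        // Both documents should yield the same single type statement.
        System.out.println(typeTriples(example14));
        System.out.println(typeTriples(example15));
    }
}
```

The element name in Example 15 carries exactly the information that the rdf:type property element carries in Example 14, which is why a conforming parser produces the same graph from either file.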

If you're using Jena, you can get extensive control over the way that your RDF/XML output is formatted. These options are documented in the Advanced RDF/XML Output section of the documentation. However, for the case that you want, simply serializing in RDF/XML versus RDF/XML-ABBREV will take care of what you want to do. For instance, look at the results using the Jena command line rdfcat tool. Here's our data (in Turtle):

# The actual namespace doesn't matter for this example.
@prefix sioc: <http://example.org/> . 

<http://example.com/vb/1035092>
  a sioc:Post ;
  sioc:has_creator "someone" .

Let's convert this to simple RDF/XML:

$ rdfcat -out RDF/XML data.n3
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://example.org/" > 
  <rdf:Description rdf:about="http://example.com/vb/1035092">
    <rdf:type rdf:resource="http://example.org/Post"/>
    <sioc:has_creator>someone</sioc:has_creator>
  </rdf:Description>
</rdf:RDF>

Now let's convert it to RDF/XML-ABBREV:

$ rdfcat -out RDF/XML-ABBREV data.n3
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://example.org/">
  <sioc:Post rdf:about="http://example.com/vb/1035092">
    <sioc:has_creator>someone</sioc:has_creator>
  </sioc:Post>
</rdf:RDF>

In the first case you see an rdf:Description element with rdf:type and sioc:has_creator subelements, but in the second case you see a sioc:Post element with only a sioc:has_creator subelement.
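
The same choice can be made from Java code by passing the writer name to Model.write. The following is a rough sketch rather than a definitive recipe: it assumes a recent Jena release (package org.apache.jena; older releases used com.hp.hpl.jena), it uses the real SIOC namespace http://rdfs.org/sioc/ns# instead of the placeholder in the Turtle above, and the class name WritePost and method buildModel are invented for the example.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;

// Hypothetical example class; requires Apache Jena on the classpath.
class WritePost {
    // Assumption: the SIOC core namespace.
    static final String SIOC = "http://rdfs.org/sioc/ns#";

    static Model buildModel() {
        Model model = ModelFactory.createDefaultModel();
        // Register the prefix so the writer can emit sioc:... element names.
        model.setNsPrefix("sioc", SIOC);

        Resource postClass = model.createResource(SIOC + "Post");
        Property hasCreator = model.createProperty(SIOC, "has_creator");

        // Same two triples as the Turtle data above.
        model.createResource("http://example.com/vb/1035092")
             .addProperty(RDF.type, postClass)
             .addProperty(hasCreator, "someone");
        return model;
    }

    public static void main(String[] args) {
        Model model = buildModel();

        // Plain writer: rdf:Description with an rdf:type child.
        model.write(System.out, "RDF/XML");

        // Abbreviating writer: the type is used as the element name.
        model.write(System.out, "RDF/XML-ABBREV");
    }
}
```

The graph is identical in both calls; only the writer name passed to write changes the serialization.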

As to best practice, I don't know that it really matters. RDF/XML-ABBREV output will typically be a bit shorter, so it incurs less network overhead in transmission, takes less storage on disk, and is easier to read. The simpler RDF/XML will be faster to write, though. On most graphs this won't make a big difference, but generating RDF/XML-ABBREV can be pretty expensive, as a recent thread on the Jena mailing list discusses.

Joshua Taylor

You really should not get hung up on what the computer-readable output of your data looks like. Jena produces valid RDF, and any other RDF parser/framework is going to be able to read it in and let you do stuff with it.

The style you want is valid RDF/XML: it is a typed node element, and it means that the thing identified by the URI is a sioc:Post. In the latter case, rdf:Description is basically just a container for stuff about that URI; you'll see a separate rdf:type triple to assert that the individual is a sioc:Post.

But seriously, to reiterate, what the RDF output looks like is not relevant. If you want it to look a certain way because you're going to edit it by hand, don't. Go get a tool like Protégé or TopBraid and use that.

Michael

Jena has two RDF/XML writers. Use RDF/XML-ABBREV to get the more readable format.

As Michael rightly says, though, don't obsess about it. Parsers don't care.

user205512