I created a little metro map in RDF/XML and wonder how to query the distance between two stops. I'm very new to SPARQL and don't know how to start.

By "distance" I mean the number of stations between the two stops. Later I want to calculate the duration, but that's another point.

That's my first approach:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.com/>

SELECT (count(?mid) as ?distance) WHERE {
  <http://example.com/StopD> ex:via* ?mid .
  ?mid ex:via+ <http://example.com/StopC> .
}

I think my query doesn't work because I'm using blank nodes? By "doesn't work" I mean that I don't get the number of nodes that lie between the two stops (like StopA and StopB). I have something like this in mind: http://answers.semanticweb.com/questions/3491/how-can-i-calculate-the-length-of-a-path-between-2-graph-nodes-in-sparql/24609

That's a sketch of my map. The numbers beside the lines represent the travel duration between two stations:

semantic-metro-map.JPG

My RDF code describes each station and its neighbouring stops, with the available lines and travel durations. At first glance it looks quite redundant, but I want to include one-direction routes (e.g. for buses) in the future, so I think it's OK for a first try.

RDF (download the file here: http://gopeter.de/misc/metro.rdf)

<?xml version="1.0"?>
<rdf:RDF 
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example.com/">

    <rdf:Description rdf:about="http://example.com/StopA">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopB" />
            <ex:Line rdf:resource="http://example.com/Line1" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopB" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>7</ex:Duration>            
        </ex:via>       

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopD" />
            <ex:Line rdf:resource="http://example.com/Line4" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>               

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopD" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>6</ex:Duration>            
        </ex:via>                       

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopE" />
            <ex:Line rdf:resource="http://example.com/Line1" />     
            <ex:Duration>1</ex:Duration>            
        </ex:via>                               

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopF" />
            <ex:Line rdf:resource="http://example.com/Line4" />     
            <ex:Duration>3</ex:Duration>            
        </ex:via>                                       

    </rdf:Description>

    <rdf:Description rdf:about="http://example.com/StopB">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line1" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>7</ex:Duration>            
        </ex:via>       

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopC" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>10</ex:Duration>           
        </ex:via>               

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopF" />
            <ex:Line rdf:resource="http://example.com/Line3" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>                       

    </rdf:Description>      

    <rdf:Description rdf:about="http://example.com/StopC">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopB" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>10</ex:Duration>           
        </ex:via>

    </rdf:Description>  

    <rdf:Description rdf:about="http://example.com/StopD">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line2" />     
            <ex:Duration>6</ex:Duration>            
        </ex:via>

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line4" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>       

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopF" />
            <ex:Line rdf:resource="http://example.com/Line3" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>               

    </rdf:Description>      

    <rdf:Description rdf:about="http://example.com/StopE">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line1" />     
            <ex:Duration>1</ex:Duration>            
        </ex:via>

    </rdf:Description>  

    <rdf:Description rdf:about="http://example.com/StopF">

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopA" />
            <ex:Line rdf:resource="http://example.com/Line4" />     
            <ex:Duration>3</ex:Duration>            
        </ex:via>

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopB" />
            <ex:Line rdf:resource="http://example.com/Line3" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>

        <ex:via rdf:parseType="Resource">
            <ex:Stop rdf:resource="http://example.com/StopD" />
            <ex:Line rdf:resource="http://example.com/Line3" />     
            <ex:Duration>2</ex:Duration>            
        </ex:via>               

    </rdf:Description>      

</rdf:RDF>
Stanislav Kralin
Slevin
  • "I think, that my query doesn't work because I'm using blank nodes?" What do you mean it doesn't work? You don't get results? Or don't get the results that you expected? – Joshua Taylor Jul 02 '14 at 19:49

1 Answer

Why yours doesn't work

Let's take a look at your data in the more easily readable Turtle syntax (below). StopD connects to three blank nodes with the ex:via property. That means that you'll get four matches for ?mid with StopD ex:via* ?mid: StopD itself (the zero-length path) plus the three blank nodes. You don't get any more, though, because the blank nodes have no outgoing ex:via links. For the same reason, ?mid ex:via+ StopC never matches: one ex:via step from a stop always ends at a blank node, and a blank node has no outgoing ex:via link to continue along. Something like ?mid (ex:via/ex:Stop)+ StopC would be better, because the ex:Stop link gets you from the blank node to another stop.

@prefix ex:    <http://example.com/> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

ex:StopD  ex:via  [ ex:Duration  "6" ;
                    ex:Line      ex:Line2 ;
                    ex:Stop      ex:StopA
                  ] ;
        ex:via  [ ex:Duration  "2" ;
                  ex:Line      ex:Line4 ;
                  ex:Stop      ex:StopA
                ] ;
        ex:via  [ ex:Duration  "2" ;
                  ex:Line      ex:Line3 ;
                  ex:Stop      ex:StopF
                ] .
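
The reasoning about ex:via* and ex:via+ can be checked mechanically. The following is a minimal Python sketch, not a SPARQL engine: it models only the StopD description as a set of triples, and the blank node labels "_:b1" to "_:b3" as well as the helper names step and closure are made up for illustration.

```python
# Hand-copied subset of the question's data: only the StopD description.
# Blank nodes are modelled with the hypothetical labels "_:b1" .. "_:b3".
TRIPLES = {
    ("StopD", "via", "_:b1"), ("_:b1", "Stop", "StopA"),
    ("StopD", "via", "_:b2"), ("_:b2", "Stop", "StopA"),
    ("StopD", "via", "_:b3"), ("_:b3", "Stop", "StopF"),
}


def step(nodes, predicate):
    """Follow one `predicate` edge from every node in `nodes`."""
    return {o for (s, p, o) in TRIPLES if p == predicate and s in nodes}


def closure(start, predicate, include_start):
    """Nodes reachable via predicate* (include_start=True) or predicate+."""
    seen = {start} if include_start else set()
    frontier = step({start}, predicate)
    while frontier - seen:
        new = frontier - seen
        seen |= new
        frontier = step(new, predicate)
    return seen


# StopD ex:via* ?mid : StopD itself plus the three blank nodes.
mids = closure("StopD", "via", include_start=True)
print(sorted(mids))

# ?mid ex:via+ StopC : no ?mid qualifies, because ex:via never leaves a blank node.
matches = [m for m in mids if "StopC" in closure(m, "via", include_start=False)]
print(matches)
```

Every node reached by one or more ex:via steps is a blank node, so no stop (StopC included) can ever appear as the object of ex:via+.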

Even though you can add the additional ex:Stop step to your property path, this still won't compute distance quite the way you want, because you won't be restricted to just one line. I.e., you'll get edges on multiple paths.

Making this work

I've recreated a simpler scenario:

@prefix : <https://stackoverflow.com/q/24538144/1281433/> .

#             B
#            * *
#        2  *   * 4
#          *     *
#         *       *
#       A +++++++++ C
#             3
#
# *** line 1
# +++ line 2

:StopA a :Stop ; :toLink :Link1 , :Link3 .
:StopB a :Stop ; :toLink :Link2 .
:StopC a :Stop .

:Link1 :hasDuration 2 ;
       :toStop :StopB ;
       :Line1Self :Link1 .

:Link2 :hasDuration 4 ;
       :toStop :StopC ;
       :Line1Self :Link2 .

:Link3 :hasDuration 3 ;
       :toStop :StopC ;
       :Line2Self :Link3 .

Each stop can connect to any number of links with :toLink, and each link points to the stop it leads to with :toStop. The line of each link is indicated with a rolification property for the line. E.g., :Link2 :Line1Self :Link2 means that :Link2 is on line 1. This lets us "stay on the right line" within a property path. Then, to find the duration of the trip from StopA to StopC on line 1, you can use a query like this:

prefix : <https://stackoverflow.com/q/24538144/1281433/>

select (sum(?duration) as ?length) where {
  :StopA :toLink/(:toStop/:toLink)*/:Line1Self ?link .
  ?link :hasDuration ?duration ;
        :toStop/(:toLink/:Line1Self/:toStop)* :StopC .
}

----------
| length |
==========
| 6      |
----------

To check for a different line, you just change the :LineXSelf properties. E.g., for line2:

prefix : <https://stackoverflow.com/q/24538144/1281433/>

select (sum(?duration) as ?length) where {
  :StopA :toLink/(:toStop/:toLink)*/:Line2Self ?link .
  ?link :hasDuration ?duration ;
        :toStop/(:toLink/:Line2Self/:toStop)* :StopC .
}
----------
| length |
==========
| 3      |
----------
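
To make it easier to see what the two property paths match, here is a rough Python mirror of the queries over the same three-stop graph. It is only a sketch under assumptions: the dictionaries and function names are invented for illustration, the line is stored as a plain label instead of a :LineXSelf self-link, and matching links are collected in a set, so it does not model SPARQL's handling of duplicate solutions.

```python
# The simplified graph from the answer, as plain Python dictionaries.
TO_LINK = {"StopA": ["Link1", "Link3"], "StopB": ["Link2"], "StopC": []}
TO_STOP = {"Link1": "StopB", "Link2": "StopC", "Link3": "StopC"}
DURATION = {"Link1": 2, "Link2": 4, "Link3": 3}
LINE = {"Link1": "Line1", "Link2": "Line1", "Link3": "Line2"}


def reachable_links(stop):
    """All links reachable from `stop`, like :toLink/(:toStop/:toLink)*."""
    seen, todo = set(), list(TO_LINK[stop])
    while todo:
        link = todo.pop()
        if link not in seen:
            seen.add(link)
            todo.extend(TO_LINK[TO_STOP[link]])
    return seen


def reaches_on_line(link, target, line):
    """True if `target` is reachable from the end stop of `link` using only
    links of `line`, like :toStop/(:toLink/:LineXSelf/:toStop)*."""
    seen, todo = set(), [TO_STOP[link]]
    while todo:
        stop = todo.pop()
        if stop == target:
            return True
        if stop not in seen:
            seen.add(stop)
            todo.extend(TO_STOP[l] for l in TO_LINK[stop] if LINE[l] == line)
    return False


def duration_on_line(start, target, line):
    """Sum the durations of every matching link, like sum(?duration)."""
    return sum(
        DURATION[link]
        for link in reachable_links(start)
        if LINE[link] == line and reaches_on_line(link, target, line)
    )


print(duration_on_line("StopA", "StopC", "Line1"))  # 6
print(duration_on_line("StopA", "StopC", "Line2"))  # 3
```

For line 1 the matching links are Link1 and Link2 (2 + 4), for line 2 only Link3, which agrees with the two result tables.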

Limitations

There are some limitations to this approach, though. Property paths are your only option for doing arbitrarily deep queries like this, but you can't use variables in property paths, which means that you can't do the following to get the distance on each of the lines:

prefix : <https://stackoverflow.com/q/24538144/1281433/>

select ?line (sum(?duration) as ?length) where {
  values ?line { :Line1Self :Line2Self }

  :StopA :toLink/(:toStop/:toLink)*/?line ?link .
  ?link :hasDuration ?duration ;
        :toStop/(:toLink/?line/:toStop)* :StopC .
}
group by ?line
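
One possible workaround, sketched here in Python purely as an illustration and not as part of the original answer, is to generate one concrete query per line from a text template, so that each generated query contains a fixed :LineXSelf property instead of a variable. The names QUERY_TEMPLATE and queries_per_line are hypothetical, and running the generated strings against a SPARQL engine is left out.

```python
# Template for the per-line query; {line} is replaced by e.g. "Line1".
# Literal braces of the SPARQL group are doubled for str.format().
QUERY_TEMPLATE = """\
prefix : <https://stackoverflow.com/q/24538144/1281433/>

select (sum(?duration) as ?length) where {{
  :StopA :toLink/(:toStop/:toLink)*/:{line}Self ?link .
  ?link :hasDuration ?duration ;
        :toStop/(:toLink/:{line}Self/:toStop)* :StopC .
}}
"""


def queries_per_line(lines):
    """Return a dict mapping each line name to its concrete query text."""
    return {line: QUERY_TEMPLATE.format(line=line) for line in lines}


queries = queries_per_line(["Line1", "Line2"])
print(queries["Line2"])
```

Each generated string is the same as the hand-written per-line query, so the results would have to be collected and combined on the client side.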
Joshua Taylor
  • OK, I think I have to revise my RDF code. I'm going to make use of reification, so that the connections between stops can easily be described and the meta-information (which line, which duration) is described as rdf:Statement. It's much cleaner, isn't it? And I can use something like this: http://answers.semanticweb.com/questions/3491/how-can-i-calculate-the-length-of-a-path-between-2-graph-nodes-in-sparql/24609 – Slevin Jul 02 '14 at 20:07
  • Ah, you like my other answer, too? :) you'll need to reify *something* a bit, but you might also need some rolification (indicating the type of something via a self-link) if you're going to make this work with property paths. I'm working on a much smaller example now to see if I can make this work. It's a good problem. – Joshua Taylor Jul 02 '14 at 20:09
  • Which other answers? I'll have a look at rolification. I'm very new to semantic web :) – Slevin Jul 02 '14 at 20:11
  • @Slevin The one that you linked to on answers.semanticweb.com (along with the answers that it links to, too). :) – Joshua Taylor Jul 02 '14 at 20:12
  • Woooops, I did not realize it was you :D – Slevin Jul 02 '14 at 20:15
  • Yeah, the real tricky thing here is that (I expect) you want your property path to follow some particular line; you don't want it adding up durations from multiple lines that might connect stops. Unfortunately, that makes things much more complicated. – Joshua Taylor Jul 02 '14 at 20:21
  • Yeah, it should be an option: faster travelling or fewer line changes (as you can see in my sketch, changing lines is sometimes faster than taking only one line, e.g. from D to C). – Slevin Jul 02 '14 at 20:31
  • I've updated the answer to show how you can get duration for a given line. There are still some limitations though, and I've shown those, too. – Joshua Taylor Jul 02 '14 at 20:38
  • Wow, thank you very much! I'll have a deeper look into it tomorrow. But I think, the approach is detailed enough, so I can continue working on it. – Slevin Jul 02 '14 at 20:43
  • @Slevin Actually, just updated again explaining some of the limitations. Beware! – Joshua Taylor Jul 02 '14 at 20:43
  • Yes, I've seen this, but I have some ideas how to figure out the whole problem because you gave me food for thought. I'll let you know when I've got a solution :) – Slevin Jul 02 '14 at 20:54
  • Just to let you know: I came up with a solution using some Python scripting. First, I converted my RDF triples into a networkx graph (Python library) and then used Dijkstra's algorithm (also included in networkx). I think it's the cleanest solution, because SPARQL lacks algorithms for path finding or the shortest path problem. Here is the full example: https://github.com/gopeter/semantic-metro-map – Slevin Jul 06 '14 at 12:36
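
The shortest-path idea from the last comment can be sketched with the standard library alone. This is not the code from the linked repository: it swaps networkx for a small heapq-based Dijkstra, the EDGES table is hand-copied from the question's RDF (keeping only the fastest duration per pair of stops), and all names are illustrative. Line changes are ignored.

```python
import heapq

# Fastest duration for each connected pair of stops, hand-copied from the
# question's RDF (the minimum over all lines serving that pair).
EDGES = {
    ("StopA", "StopB"): 2, ("StopA", "StopD"): 2, ("StopA", "StopE"): 1,
    ("StopA", "StopF"): 3, ("StopB", "StopC"): 10, ("StopB", "StopF"): 2,
    ("StopD", "StopF"): 2,
}


def build_graph(edges):
    """Turn the undirected edge table into an adjacency dictionary."""
    graph = {}
    for (a, b), duration in edges.items():
        graph.setdefault(a, {})[b] = duration
        graph.setdefault(b, {})[a] = duration
    return graph


def shortest_duration(graph, source, target):
    """Dijkstra's algorithm; returns None if `target` is unreachable."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, stop = heapq.heappop(queue)
        if stop == target:
            return dist
        if dist > best.get(stop, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, duration in graph[stop].items():
            candidate = dist + duration
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


GRAPH = build_graph(EDGES)
print(shortest_duration(GRAPH, "StopD", "StopC"))  # 14
```

The result for StopD to StopC corresponds to a route such as D, A, B, C with durations 2 + 2 + 10.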