
SPARQL Query

I have the SPARQL query shown below:

SELECT DISTINCT ?name1 
    WHERE {

        GRAPH <blabla>
        {
            ?k swrc:author ?x .
            ?x foaf:name ?name1 . 
        } .

        GRAPH <blabla2>
        {   
            ?l swrc:author ?y .
            ?y foaf:name ?name2 .
        } .

        FILTER(?x != ?y) . 
    }

I want to get the names that exist only in the first graph blabla.

Problem

Counterintuitively, I get some names that actually belong to the intersection. Does this happen because b (of set A) = b (of set B)?

Question

What exactly are the semantics of != ? How can I work around this problem?

dsapalo
Paramar
  • A named graph is just a collection of triples. It doesn't "own" the resources that appear in its triples. RDF resources are either URIs, blank nodes, or literals. A URI is the same no matter what graph it's appearing in. That's actually one of the big _features_ of using URIs as identifiers. It means that you can describe some resource, and so can someone else. – Joshua Taylor Jan 30 '14 at 21:46
  • Given that URIs are identical regardless of where they appear, this seems like a duplicate of your earlier question [Set difference in SPARQL](http://stackoverflow.com/q/21391444/1281433), modulo the fact that you're using different patterns to bind variables. It's still the same issue: if you want a set difference `{?e : e in A, e not in B}`, you need a query like `...?e is an element of A... filter not exists { ...?e is an element of B... }`. – Joshua Taylor Jan 30 '14 at 21:50

1 Answer


The semantics of != are exactly that its left argument is not equal to its right argument. But a FILTER is evaluated for every possible combination of values, so the query as you formulated it returns the name of every ?x for which at least one value of ?y is not equal to it.
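
To make the per-combination evaluation concrete, here is a small Python model of what the original query computes. It is only a sketch, not SPARQL, and the author identifiers are made up for illustration: every pairing of ?x from the first graph with ?y from the second is tested on its own, and ?x survives if any single pairing passes.

```python
from itertools import product

# Hypothetical author identifiers; "b" occurs in both graphs.
graph1_authors = ["a", "b"]
graph2_authors = ["b", "c"]

# The two GRAPH patterns are joined, giving every (?x, ?y) combination;
# FILTER(?x != ?y) is then applied to each combination separately.
solutions = [(x, y) for x, y in product(graph1_authors, graph2_authors) if x != y]

# SELECT DISTINCT on the ?x side of the surviving combinations.
distinct_x = sorted({x for x, _ in solutions})

print(solutions)   # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(distinct_x)  # ['a', 'b']
```

In this model "b" is still returned: the pairing ("b", "b") is filtered out, but ("b", "c") passes, which is the same effect as the unexpected names from the intersection.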

If you want to get back only the names of those ?x for which every value of ?y is different, you should use a NOT EXISTS clause:

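# Prefix IRIs are assumed here; use the ones that match your data.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX swrc: <http://swrc.ontoware.org/ontology#>

SELECT DISTINCT ?name1
WHERE {
  # Authors, with their names, that occur in the first graph.
  GRAPH <blabla>
  {
    ?k swrc:author ?x .
    ?x foaf:name ?name1 .
  }
  # Keep ?x only if it is not an author of anything in the second graph.
  FILTER NOT EXISTS {
    GRAPH <blabla2>
    {
      ?l swrc:author ?x .
    }
  }
}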

Note that with this approach you can drop the variable ?y altogether: the condition now simply checks that author ?x, as found in the first graph, does not also occur in the second graph.
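
As a rough illustration of why this gives the set difference, the same idea can be modelled in plain Python. This is a sketch with made-up author identifiers, not how a SPARQL engine is implemented: each ?x from the first graph is kept only if no match for it exists in the second.

```python
# Hypothetical author identifiers; "b" occurs in both graphs.
graph1_authors = ["a", "b"]
graph2_authors = ["b", "c"]

# FILTER NOT EXISTS { ... ?x ... }: for each ?x from the first graph,
# check whether the inner pattern has any match in the second graph,
# and keep ?x only when it has none.
authors_in_second = set(graph2_authors)
only_in_first = [x for x in graph1_authors if x not in authors_in_second]

print(only_in_first)  # ['a']
```

Here "b" is dropped because it has a match in the second graph, regardless of which other authors that graph contains.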

Jeen Broekstra