2

Imagine that you have a simple social network where people must have only one property rdfs:label with value "Person" and can have any number of foaf:knows whose values must also be people with the same structure. Some example data could be:

:peter foaf:knows :john; 
       foaf:knows :anna;
       rdfs:label "Person" .

:john  foaf:knows :anna;
       rdfs:label "Person" .

:anna  rdfs:label "Person" .

In logic terms, the definition could be something like:

∀x(Person(x) ≡ rdfs:label(x,"Person")∧∀y(rdfs:label(x,y)→y="Person")∧∀y(foaf:knows(x,y)→Person(y)))

Is it possible to express those recursive definitions in SPARQL?

I was able to express part of the query without the recursive reference of foaf:knows as:

PREFIX ex: <http://xmlns.com/foaf/0.1/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>

select ?person {

    # Ensure there is only one rdfs:label
    { SELECT ?person {
      ?person rdfs:label ?o .
    } GROUP BY ?person HAVING (COUNT(*)=1)}

    # Ensure the rdfs:label value is "Person"
    { SELECT ?person {
      ?person rdfs:label ?o . 
      FILTER ((?o = "Person"))
    } GROUP BY ?person HAVING (COUNT(*)=1)}

    # Count the number of foaf:knows
    { SELECT ?person (COUNT(*) AS ?p_c0) { 
       ?person foaf:knows [] . 
      } GROUP BY ?person
    }

    # Count the number of foaf:knows with a value that has the structure of a person
    { SELECT ?person (COUNT(*) AS ?p_c1) { 
       ?person foaf:knows ?person1 . # How can I express that person1 has the structure of people?
      } GROUP BY ?person
    }
    FILTER (?p_c0 = ?p_c1)
}

Is it possible to express such recursive definitions in SPARQL?

Note: I edited the question changing the term "constraint" by "definition" following Joshua's suggestion

Labra
  • 1,412
  • 1
  • 13
  • 33

2 Answers2

7

This is a definition, not a pair of constraints

We often think of definitions in terms of necessary and sufficient conditions. Sufficient conditions are those that give us "enough" information to conclude that something is a element of a given set, and necessary conditions are those that tell us a bit more about the individuals. For instance, in OWL, we might have the axioms:

Man &sqsubseteq; Person
Person &sqsubseteq; ∃hasName

The first is a sufficient condition for Person: knowing that something is a Man is sufficient to determine that it's also a Person. The second is a necessary condition for Persons: if something is a person, then it must have a name. (Dually, we can also note that the first axiom is necessary condition for Man: if something is a Man, then it must be a Person. The second axiom is a sufficient condition for ∃hasName; if something is a Person, then it must have a name.)

Constraint checking is typically the task of finding individuals that meet the sufficient conditions for a class, but don't satisfy all the necessary conditions. That's not what you're trying to do here. Instead, you're looking for individuals that meet the necessary and sufficient conditions of personhood:

  1. have exactly the label "Person"
  2. know only other persons.

In constraint validation, you'll write a query that that finds problematic individuals (e.g., things that are supposed to be persons, but aren't), but in your task, you'll find good individuals.

Finding individuals that meet your specification

In general you can't specify recursive definitions in SPARQL, but in this case, you can write a query that will select all people. The trick is to first use a pattern that identifies all nodes in the graph. Then, conceptually, we suppose that each one is a person, and then filter out those that don't meet the conditions. That's possible in this case, because the condition is simply that everything reachable by a chain of foaf:knows (including the zero length chain) should have the label "Person" and nothing else. Here's some sample data (including the examples from your answer), the query, and finally the results.

@prefix : <http://stackoverflow.com/q/25256452/1281433/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.

:peter foaf:knows :john; 
       foaf:knows :anna;
       rdfs:label "Person" .

:john  foaf:knows :anna;
       rdfs:label "Person" .

:anna  rdfs:label "Person" .

:mary rdfs:label "Person" .

:tom rdfs:label "Cat" .

:pluto rdfs:label "Dog" ; foaf:knows :tom .

:ben rdfs:label "Wolf"; rdfs:label "Person" .

:mary rdfs:label "Person"; foaf:knows :ben .

:sam rdfs:label "Person"; foaf:knows :mary .
prefix : <http://stackoverflow.com/q/25256452/1281433/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix foaf: <http://xmlns.com/foaf/0.1/>

select ?person where { 
  #-- each node in the graph
  ?person :? ?person .

  #-- except those that foaf:know* some ?x
  #-- (and since * includes the zero length
  #-- path, ?x is also bound to ?person)
  #-- that don't meet the labeling condition.
  filter not exists {
    ?person foaf:knows* ?x
    optional { ?x rdfs:label ?label }
    filter ( !bound(?label) || ?label != "Person" )
  }
}
----------
| person |
==========
| :anna  |
| :john  |
| :peter |
----------

Constraint Checking

Now, suppose that conditions 1 and 2 above were actually necessary conditions for personhood. Then, depending on the sufficient conditions, we could write different queries to find violations. You probably want to have non-person nodes in the graph, so you might have a sufficient condition that a node has rdf:type :Person. Then you can use a query like this:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix : <http://stackoverflow.com/q/25256452/1281433/>

#-- There are two way something with type :Person
#-- can be invalid.  (i) it can lack a label, or 
#-- have a label other than "Person";  (ii) it 
#-- can have a value of foaf:knows* that doesn't
#-- have rdf:type :Person.

select ?person where {
  #-- Examine each person in the graph.
  ?person a :Person .

  { #-- Check that ?person has a label, and that
    #-- that it has no label other than "Person"
    optional { ?person rdfs:label ?label } 
    filter ( !bound(?label) || ?label != "Person" )
  } UNION
  { #-- Check that every value of foaf:knows 
    #-- also has type :Person.  If some value
    #-- has type :Person, but violates the constraints,
    #-- we'll catch it separately.
    ?person foaf:knows ?x .
    filter not exists { ?x a :Person }
  }
}
Joshua Taylor
  • 84,998
  • 9
  • 154
  • 353
  • Thanks for your answer. Yes, I wanted to check policy violations and I also wanted to avoid having a explicit type :Person. And there comes the recursion: I want to identify a Person by its structure which in this case it means that it has only one rdfs:label with value "Person" and that it knows only resources with the structure of a Person. – Labra Aug 14 '14 at 06:15
  • @Labra If you *define* a person as something which has exactly the label "Person" and knows only other other Persons, then there can be no policy **violations**; there are only things which are Persons, and things which are not. That's why it's so important to pin down an answer "what's supposed to be a person?" – Joshua Taylor Aug 14 '14 at 11:38
  • @Labra As I mentioned in the comments, if your conditions are the definition of person, then it's not really a constraint, because there are no violations. However, in this case, you can also write a query that selects individuals satisfying the definition. I've updated my answer. – Joshua Taylor Aug 14 '14 at 12:01
  • Yes, I think your answer really solves the problem. Great!. It seems that the recursive is somewhat captured in the zero-length path. My recursive definition of Person was something like: `ForAll x(Person(x) <-> rdfs:label(x,"Person") /\ ForAll y (rdfs:label(x,y) -> y = "Person") /\ ForAll y (foaf:knows(x,y) -> Person(y))` – Labra Aug 14 '14 at 12:26
  • @Labra I just committed another edit, too. I think things are explained clearly now. As I said, in general you can't do recursive things, but in this case the definition is just that: ?x is person if every ?y such that ?x foaf:knows* ?y has exactly the label "Person". – Joshua Taylor Aug 14 '14 at 12:32
1

One problem with the previous answer is that it needs an extra predicate to get values of type :Person.

However, the problem is that we would like to identify a Person by its structure, which in this case means that:

  1. It has only one rdfs:label property with value "Person"
  2. It has zero or more foaf:knows properties whose values follow these constraints.

And notice that the second constraint has a recursive nature because it is a self-reference.

A partial solution (without the recursive constraint) could be:

prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix : <http://stackoverflow.com/q/25256452/1281433/>

select ?person where {
  #-- Examine each non-literal node in the graph.
  ?person :* ?person .
  filter ( !isLiteral(?person) )

  { #-- Check that ?person has a label, and that
    #-- it has no label other than "Person"
    optional { ?person rdfs:label ?label } 
    filter ( !bound(?label) || ?label != "Person" )
  } UNION
  { #-- Check that every value of foaf:knows 
    #-- satisfies the constraints of Person
    ?person foaf:knows ?x .
    filter not exists { 
      ?x rdfs:label "Person" .
      # Here I would like to also express that ?x only knows 
      # values that satisfy the constraints of Person 
      # but this would be recursive
    }
  }
}

We can add some more erroneous values to check if it detects them:

:tom rdfs:label "Cat" .

:pluto rdfs:label "Dog" ; foaf:knows :tom .

:ben rdfs:label "Wolf"; rdfs:label "Person" .

:mary rdfs:label "Person"; foaf:knows :ben .

:sam rdfs:label "Person"; foaf:knows :mary .

The previous SPARQL query detects :pluto, :tom, and :ben, but it does not detect :mary or :sam which don't follow the constraints.

Labra
  • 1,412
  • 1
  • 13
  • 33
  • If you're just trying to identify persons as things that have a particular structure, then it doesn't make sense to talk about policy violations, because there are none. There are simply things that meet your definition and things that don't. If something doesn't meet the definition, it's not a violation, it just doesn't happen to be a person. – Joshua Taylor Aug 14 '14 at 11:40
  • My intention was to identify the resources that had some structure whose definition was recursive. It is very nice that it can be solved using the zero-length property paths. Thanks! – Labra Aug 14 '14 at 12:35