
When I run this query against the Nobel Prize database, I get an error unless I use a LIMIT clause.

The following query works because it has a LIMIT clause:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX nobel: <http://data.nobelprize.org/terms/>
PREFIX cat: <http://data.nobelprize.org/resource/category/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbp: <http://dbpedia.org/property/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT DISTINCT ?parentName ?childName 
WHERE {
  ?child owl:sameAs ?personChild ;
      foaf:name ?childName .

  SERVICE <http://dbpedia.org/sparql> {
    { ?personParent dbp:children ?personChild .  }
    UNION
    { ?personChild dbp:parents ?personParent . }
  }

  ?parent owl:sameAs ?personParent ;
      foaf:name ?parentName .
} LIMIT 2

Strangely, the same query fails when I remove the LIMIT clause. Instead of results, I get the following error message:

Error 500: HTTP 400 error making the query: Bad Request

What is the reason for this behavior? Am I doing something wrong?

Thanks.

fingerprints
  • Is it the same query just without the `LIMIT`? The triple store is `Fuseki - version 1.1.0`. Maybe it fails without the `LIMIT` because the federated query is too expensive, i.e. it may lead to a timeout. Depending on the implementation, there might also be problems with the remote service that is called in the `SERVICE` clause. – UninformedUser Oct 19 '17 at 01:14
  • I'd suggest setting up your own triple store and loading the data locally. Then you could get better error logs and have full control. At least loading the Nobel Prize RDF dump shouldn't be that time-consuming. – UninformedUser Oct 19 '17 at 01:16
  • @AKSW thanks, I'm going to try that. – fingerprints Oct 19 '17 at 08:05
  • @winter, I have edited my answer, things are slightly different... – Stanislav Kralin Oct 20 '17 at 11:22

1 Answer

I have loaded a small part of the triples from your Fuseki 1 into my Fuseki 2 and analyzed the network logs.

When executing your query, Fuseki (or rather ARQ) sends many queries of this kind to DBpedia (in the actual requests, the prefixes are expanded):

SELECT  *
WHERE
  {   { ?personParent dbp:children  viaf:58991016 }
    UNION
      { viaf:58991016 dbp:parents  ?personParent }
  }

Then, at some point, Fuseki sends this query:

SELECT  *
WHERE
  {   { ?personParent  dbp:children  <Barack Obama> }
    UNION
      { <Barack Obama>  dbp:parents  ?personParent }
  }

The strange URI in the query above is not valid. You can check this yourself by clicking "Barack Obama" on this page.

Virtuoso returns an error, and Fuseki stops execution.

If the LIMIT clause is present, then, with some luck, Fuseki retrieves a sufficient number of results from DBpedia (and stops execution without an error) before it sends the erroneous query above.
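The interaction between LIMIT and the failing request can be illustrated with a small Python simulation. This is only a toy model of lazy evaluation, not ARQ's actual implementation: `remote_parents`, `federated_join`, `BadRequest` and the sample data are hypothetical names invented for the sketch, and DISTINCT is not modeled.

```python
from itertools import islice


class BadRequest(Exception):
    """Stands in for the HTTP 400 returned by the remote endpoint."""


def remote_parents(child_iri):
    """Hypothetical stand-in for one SERVICE request to DBpedia."""
    if " " in child_iri:  # a value such as "Barack Obama" is not a valid IRI
        raise BadRequest(f"HTTP 400 for <{child_iri}>")
    fake_data = {  # made-up sample data, for illustration only
        "dbr:Irène_Joliot-Curie": ["dbr:Marie_Curie", "dbr:Pierre_Curie"],
        "dbr:Kai_Siegbahn": ["dbr:Manne_Siegbahn"],
    }
    return fake_data.get(child_iri, [])


def federated_join(children):
    """Lazily yield (parent, child) pairs, one remote request per child."""
    for child in children:
        for parent in remote_parents(child):
            yield parent, child


children = ["dbr:Irène_Joliot-Curie", "dbr:Kai_Siegbahn", "Barack Obama"]

# With "LIMIT 2": iteration stops before the invalid value is ever sent.
limited = list(islice(federated_join(children), 2))

# Without a limit: the generator eventually reaches the invalid value.
failure = None
try:
    unlimited = list(federated_join(children))
except BadRequest as err:
    failure = str(err)
```

In this sketch the limited run succeeds only because the invalid value happens to come last, which mirrors the "with some luck" caveat: a different evaluation order could hit the bad request first.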

I suggest adding some filtering conditions to your query:

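PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpprop: <http://dbpedia.org/property/>
PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>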

SELECT DISTINCT ?parentName ?childName 
WHERE {
  ?child owl:sameAs ?personChild ;
      foaf:name ?childName .
  FILTER (afn:namespace(?personChild) = str(dbpedia:))

  SERVICE <http://dbpedia.org/sparql> {
    { ?personParent dbpprop:children ?personChild .  }
    UNION
    { ?personChild dbpprop:parents ?personParent . }
    FILTER (isIRI(?personParent))
  }

  ?parent owl:sameAs ?personParent ;
      foaf:name ?parentName .
}

The result should be:

+-------------------------------+----------------------+
|          parentName           |      childName       |
+-------------------------------+----------------------+
| "Marie Curie, née Sklodowska" | "Irène Joliot-Curie" |
| "Pierre Curie"                | "Irène Joliot-Curie" |
| "Karl Manne Georg Siegbahn"   | "Kai M. Siegbahn"    |
+-------------------------------+----------------------+

In the query above:

  • PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>: the afn: prefix declaration for Fuseki 1;

  • FILTER (afn:namespace(?personChild) = str(dbpedia:)): filters out incorrect URIs (and also non-DBpedia URIs, reducing the number of queries);

  • FILTER (isIRI(?personParent)): filters out occasional literal values of the properties, slightly reducing the DBpedia response size.
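
To see which owl:sameAs targets the first filter lets through, here is a rough Python approximation. It replaces afn:namespace with a plain string-prefix test, which is an assumption made for illustration and not how ARQ splits IRIs. `keep_for_service` is a hypothetical helper, and the sample values are modeled on the requests described above.

```python
DBPEDIA_NS = "http://dbpedia.org/resource/"


def keep_for_service(value):
    """Approximate the namespace filter with a simple prefix test."""
    return value.startswith(DBPEDIA_NS)


same_as_values = [
    "http://dbpedia.org/resource/Kai_Siegbahn",  # DBpedia resource: kept
    "http://viaf.org/viaf/58991016",             # other dataset: dropped
    "Barack Obama",                              # not a valid IRI: dropped
]

kept = [v for v in same_as_values if keep_for_service(v)]
```

Only values in the DBpedia resource namespace would reach the SERVICE block, so neither the VIAF identifier nor the invalid value triggers a remote request.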


Now I understand why you do not use DBpedia data about Nobel awards directly. The shortest path between the Scylla of DBpedia data quality and the Charybdis of Virtuoso 7 bugs seems to be the following:

SELECT DISTINCT ?dbpediaChild ?dbpediaParent {
    VALUES (?award2) { (dbr:Nobel_Prize_in_Chemistry)
                       (dbr:Nobel_Prize_in_Physics)
                       (dbr:Nobel_Peace_Prize)
                       (dbr:Nobel_Prize_in_Physiology_or_Medicine)
                       (dbr:Nobel_Prize_in_Literature) }
    VALUES (?award1) { (dbr:Nobel_Prize_in_Chemistry)
                       (dbr:Nobel_Prize_in_Physics)
                       (dbr:Nobel_Peace_Prize)
                       (dbr:Nobel_Prize_in_Physiology_or_Medicine)
                       (dbr:Nobel_Prize_in_Literature) }
    ?award1 a dbo:Award .
    ?award2 a dbo:Award .
    ?dbpediaChild  dbo:award/(dbo:wikiPageRedirects*)  ?award1 .
    ?dbpediaParent dbo:award/(dbo:wikiPageRedirects*)  ?award2 .
    ?dbpediaChild dbp:parents|^dbp:children ?dbpediaParent .
}

However, the result will be only:

+-------------------------+--------------------+
|      dbpediaChild       |   dbpediaParent    |
+-------------------------+--------------------+
| dbr:Kai_Siegbahn        | dbr:Manne_Siegbahn |
| dbr:Irène_Joliot-Curie  | dbr:Marie_Curie    |
+-------------------------+--------------------+
Stanislav Kralin
  • Thanks for your recommendations, I'm going to apply them! – fingerprints Oct 19 '17 at 12:45
  • I have one question about your first query: in your first proposal you used `?dbpediaChild (dbpprop:parents|^dbpprop:children) ?dbpediaParent .` instead of the `UNION` clause, but now you're proposing `UNION` again. Is there a reason for that? – fingerprints Oct 20 '17 at 12:37
  • @winter, no, it doesn't matter. – Stanislav Kralin Oct 20 '17 at 13:09
  • @StanislavKralin - You referred to a "Charybdis of Virtuoso bugs". I don't see anything here that suggests any such. Would you please clarify and/or detail what you mean, so we have a hope of fixing code (if indeed there are bugs) or perception (if not)? – TallTed Feb 27 '18 at 21:35
  • @TallTed, as far as I remember, there were problems when two or more property paths with the same variable were used in a query. I'll try to post the problematic query tomorrow. – Stanislav Kralin Feb 27 '18 at 21:43
  • @TallTed, please try to remove `?award1 a dbo:Award . ?award2 a dbo:Award .` from the above query. Obviously, these two patterns are superfluous, yeah? :) – Stanislav Kralin Feb 28 '18 at 15:00
  • @StanislavKralin I see the strange error in the [old Virtuoso 7 hosting dbpedia.org](http://dbpedia.org/sparql). Happily, I see it's resolved in the [current Virtuoso 8.1 (Enterprise Edition) build behind live.dbpedia.org](http://live.dbpedia.org/sparql/). (I'd give you a live link, but SO comment length limits are too short.) – TallTed Feb 28 '18 at 15:37
  • @TallTed, I have replaced "Virtuoso" with "Virtuoso 7" in my answer. Feel free to edit my post in order to replace it with something else. – Stanislav Kralin Feb 28 '18 at 15:43