This is a follow-up to an earlier question in which the schema.org data was used to find the number of children of a class. The answer there gives the number of children for each class. In my application I also need the grand total of all children (i.e. the sum of the per-group counts), so that I can compute each group's percentage of the total. The query I got from the previous question is:

prefix schema:  <http://schema.org/>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

select   ?child (count(?grandchild) as ?nGrandchildren) 
from <http://localhost:3030/testDB/data/schemaOrg>
where {
  ?child rdfs:subClassOf schema:Event .
  optional { 
    ?grandchild rdfs:subClassOf ?child
  }
}  group by ?child

This gives the expected answer (each event class and its number of children). How can I get the total number? I tried a nested query:

prefix schema:  <http://schema.org/>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

select   ?child (count(?grandchild) as ?nGrandchildren) 
from <http://localhost:3030/testDB/data/schemaOrg>
where {
  select (count(?grandchild) as ?grandTotal) 
  {?child rdfs:subClassOf schema:Event .
  optional { 
    ?grandchild rdfs:subClassOf ?child
     }
   }
}  group by ?child

but it returned only a single row: " " -> 0.

user855443

1 Answer

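This query uses two sub-SELECTs:

* the first computes the number of grandchildren per child
* the second returns the total number of grandchildren

The second sub-SELECT projects only ?totalGrandchildren, so the two results share no variables and their join attaches the single total row to every per-child row.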

prefix schema:  <http://schema.org/>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

select ?child ?nGrandchildren 
(round(?nGrandchildren/?totalGrandchildren * 100) as ?percentageGrandchildren) {

# compute number per child
{
select ?child (count(?grandchild) as ?nGrandchildren) where {
  ?child rdfs:subClassOf schema:Event .
  optional { 
    ?grandchild rdfs:subClassOf ?child
  }
}
group by ?child
}

# compute total number
{
select (count(?grandchild) as ?totalGrandchildren) where {
  ?child rdfs:subClassOf schema:Event .
  optional { 
    ?grandchild rdfs:subClassOf ?child
  }
}
}

}

Output

-----------------------------------------------------------------------------------------------
| child                   | nGrandchildren | percentageGrandchildren                          |
===============================================================================================
| schema:UserInteraction  | 9              | "82"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:FoodEvent        | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:MusicEvent       | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:PublicationEvent | 2              | "18"^^<http://www.w3.org/2001/XMLSchema#decimal> |
| schema:LiteraryEvent    | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:SportsEvent      | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:DanceEvent       | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:ScreeningEvent   | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:DeliveryEvent    | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:ExhibitionEvent  | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:EducationEvent   | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:SaleEvent        | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:VisualArtsEvent  | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:CourseInstance   | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:ChildrensEvent   | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:BusinessEvent    | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:Festival         | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:ComedyEvent      | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:TheaterEvent     | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
| schema:SocialEvent      | 0              | "0"^^<http://www.w3.org/2001/XMLSchema#decimal>  |
-----------------------------------------------------------------------------------------------
UninformedUser
  • Perfect solution. I did not understand how to work with multiple sub-SELECTs; the answer helped me understand it. Thank you! – user855443 Nov 19 '17 at 00:13
  • In my application, the `compute number per child` part is time-consuming and I would rather not run it twice. Is there a way to reuse its result (the count per child, `?nGrandchildren`) and sum it to get the grand total (`?totalGrandchildren`) and the percentage? – user855443 Nov 19 '17 at 00:53
  • That would require a `sum` over the `nGrandchildren` values in an outer query, but then you would lose the `child` variable. I don't think it's possible, but I might be wrong. In general, though, it's the query optimizer's job to reuse the results of joins, so I don't know whether this is really a bottleneck here. Which triple store do you use? Which dataset? For the schema.org data this task can't be expensive; it's a pretty simple query. – UninformedUser Nov 19 '17 at 12:41
  • I use Fuseki from Jena, and my application store has something like 400 M triples. I found the schema.org data useful for understanding the issues. Thank you for your help! The application query takes 4 minutes to run on an average PC (i5, 24 GB). I decided to compute the grand total in the statistical program that consumes the result of this query. – user855443 Nov 19 '17 at 22:53
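
The last comment settles on computing the grand total outside SPARQL, so the expensive per-child count only runs once. A minimal sketch of that client-side step in Python, assuming the per-child counts have already been fetched into a dictionary. The function name `add_percentages` and the variable `counts` are illustrative and do not come from the thread; the numbers are copied from the output table above, with most of the zero rows left out.

```python
from typing import Dict


def add_percentages(counts: Dict[str, int]) -> Dict[str, int]:
    """Return each group's share of the grand total as a rounded percentage."""
    # Grand total = sum of the per-child counts (the ?nGrandchildren column).
    total = sum(counts.values())
    if total == 0:
        # No grandchildren anywhere: avoid dividing by zero.
        return {child: 0 for child in counts}
    return {child: round(n / total * 100) for child, n in counts.items()}


# Hypothetical input: per-child counts as returned by the first sub-SELECT.
counts = {
    "schema:UserInteraction": 9,
    "schema:PublicationEvent": 2,
    "schema:FoodEvent": 0,
    "schema:MusicEvent": 0,
}

percentages = add_percentages(counts)
for child, pct in percentages.items():
    print(f"{child}: {counts[child]} ({pct}%)")
```

With these counts the grand total is 11, which matches the 82 and 18 percent shown in the answer's output.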