I am trying to use the SPARQL group_concat aggregate with ARC2, but it appears that it is not supported. Is there a workaround or an alternative I could use? Given the data:

@prefix recipe: <url> .
@prefix myrecipe: <url> .
@prefix myfood: <url> .

myrecipe:Pizza a recipe:Recipe ;
recipe:ingredients [ a recipe:IngredientList ;
    rdf:_1 [ a recipe:Ingredient ;
        recipe:food     myfood:Chicken ;
        recipe:quantity "225g" ;
    ] ;
    rdf:_2 [ a recipe:Ingredient ;
        recipe:food     myfood:Bacon ;
        recipe:quantity "125g" ;
    ] ;
]

I would like my query to return one result with multiple ingredients. My current query returns a new result for each ingredient:

SELECT ?recipe ?food
WHERE {
        ?recipe a recipe:Recipe ; 
        recipe:ingredients ?ingredientList .
        ?ingredientList ?p ?s .
        ?s a recipe:Ingredient ;
        recipe:food ?food
}

When I try using the group_concat aggregate, the query returns 0, and I can't find any reference to group_concat in the code base.

Example query using group_concat:

SELECT ?recipe (GROUP_CONCAT(?food) as ?ing)
    WHERE {
        ?recipe a recipe:Recipe ; 
        recipe:ingredients ?ingredientList .
        ?ingredientList ?p ?s .
        ?s a recipe:Ingredient ;
        recipe:food ?food
    }

This is then passed to ARC2 as a SPARQL string, and the result returned is 0. This normally indicates an error in the query, but I can't see one other than the use of group_concat.

palacealex
  • Are you using `group_concat` correctly? I see a query that _doesn't_ use `group_concat`, but what's the query with `group_concat` that you used, and what were the actual results (I'm not clear what you mean by "the query returns `0`"). Without seeing your actual query, it's hard to tell whether there's something wrong with the query or something missing from ARC2. Please note that questions about code should include the actual problematic code, and that questions asking for code should include the attempted solutions. The working query here is fine, but we need the problematic one, too. – Joshua Taylor Dec 30 '13 at 14:52
  • Sorry, this was my first question so still getting to grips with the correct format for writing questions. I will edit the original question to include an example of the query I was running – palacealex Dec 30 '13 at 16:04

1 Answer

The question (at least at the time that I'm writing) doesn't include a query that uses group_concat or show any results, so it's not immediately clear whether the problem is in constructing a query that uses group_concat or in getting it to work with ARC2. However, ARC2's last release dates from 2011: the downloadable version of ARC2 was last edited two years ago, and the ARC2 class has a getVersion function that returns 2011-12-01. SPARQL 1.1 (which defines group_concat) wasn't published until 21 March 2013, so I wouldn't be surprised if ARC2 only supports the first version of SPARQL. Additionally, the History section in ARC2's readme says:

ARC started in 2004 as a lightweight RDF system for parsing and serializing RDF/XML files. It later evolved into a more complete framework with storage and query functionality. By 2011, ARC2 had become one of the most-installed RDF libraries. Nevertheless, active code development had to be discontinued due to lack of funds and the inability to efficiently implement the ever-growing stack of RDF specifications.

Cleaning up the data

Your data isn't in a completely usable format, since the rdf: prefix wasn't defined. Here's data that we can use:

@prefix myrecipe: <https://stackoverflow.com/q/20827591/1281433/myrecipe/> .
@prefix recipe: <https://stackoverflow.com/q/20827591/1281433/recipe/> .
@prefix myfood: <https://stackoverflow.com/q/20827591/1281433/myfood/> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

myrecipe:Pizza  a           recipe:Recipe ;
        recipe:ingredients  [ a       recipe:IngredientList ;
                              rdf:_1  [ a                recipe:Ingredient ;
                                        recipe:food      myfood:Chicken ;
                                        recipe:quantity  "225g"
                                      ] ;
                              rdf:_2  [ a                recipe:Ingredient ;
                                        recipe:food      myfood:Bacon ;
                                        recipe:quantity  "125g"
                                      ]
                            ] .

A query using group_concat

A query like this shows how you can use group_concat:

prefix recipe: <https://stackoverflow.com/q/20827591/1281433/recipe/>
prefix myrecipe: <https://stackoverflow.com/q/20827591/1281433/myrecipe/>
prefix myfood: <https://stackoverflow.com/q/20827591/1281433/myfood/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

select ?recipe (group_concat(?food) as ?ingredients) where {
  ?recipe a recipe:Recipe ; 
  recipe:ingredients ?ingredientList .
  ?ingredientList ?p ?s .
  ?s a recipe:Ingredient ;
     recipe:food ?food
}
group by ?recipe

--------------------------------------------------------------------------------------------------------------------------------------------
| recipe         | ingredients                                                                                                             |
============================================================================================================================================
| myrecipe:Pizza | "https://stackoverflow.com/q/20827591/1281433/myfood/Bacon https://stackoverflow.com/q/20827591/1281433/myfood/Chicken" |
--------------------------------------------------------------------------------------------------------------------------------------------
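If the query engine doesn't support group_concat at all, one possible workaround is to run the ungrouped query and do the grouping in application code. The following is only a minimal sketch in Python, not anything ARC2 provides: the function name group_concat, its parameters, and the shape of rows (one dictionary per result row, keyed by variable name, with prefixed names standing in for full IRIs) are all assumptions made for illustration.

```python
from collections import defaultdict


def group_concat(rows, group_var, value_var, separator=" "):
    """Emulate `select ?g (group_concat(?v) as ?x) ... group by ?g` on plain rows.

    `rows` is assumed to be a list of dicts keyed by variable name
    (a hypothetical shape; adapt it to whatever your client returns).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_var]].append(row[value_var])
    # SPARQL leaves the concatenation order unspecified; sorting here just
    # makes the output deterministic. The default separator is one space,
    # matching SPARQL's default for group_concat.
    return {key: separator.join(sorted(values)) for key, values in groups.items()}


# Rows as the ungrouped query might return them (illustrative values only).
rows = [
    {"recipe": "myrecipe:Pizza", "food": "myfood:Chicken"},
    {"recipe": "myrecipe:Pizza", "food": "myfood:Bacon"},
]

result = group_concat(rows, "recipe", "food")
print(result)  # {'myrecipe:Pizza': 'myfood:Bacon myfood:Chicken'}
```

The same loop translates directly to whatever language hosts the query engine; the only design choice is that grouping happens after the query rather than inside it.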

More about group_concat

group_concat is described in section 18.5.1.7, GroupConcat, of the SPARQL 1.1 specification.

Some notes about the data…

I'd point out that the triple pattern ?ingredientList ?p ?s in the query is being used to get elements of your ingredient list. This might get more information than what you're actually looking for, though. If you're using a query engine that can do a bit of inference, then you might be interested in using rdfs:member, which is a superproperty of all the rdf:_nnn properties:

5.1.6 rdfs:member

rdfs:member is an instance of rdf:Property that is a super-property of all the container membership properties i.e. each container membership property has an rdfs:subPropertyOf relationship to the property rdfs:member.

In principle, this means that you could write your query as:

select ?recipe (group_concat(?food) as ?ingredients) where {
  ?recipe a recipe:Recipe ; 
          recipe:ingredients [ rdfs:member [ recipe:food ?food ] ] .
}
group by ?recipe

Now, you'd need a query engine that can also do RDF(S) reasoning for that to work. However, if you used an actual RDF list, you could write a query directly. In particular, if you represented your data like this:

myrecipe:Pizza a recipe:Recipe ;
               recipe:ingredients ( [ a recipe:Ingredient ;
                                      recipe:food myfood:Chicken ;
                                      recipe:quantity "225g" ]
                                    [ a recipe:Ingredient ;
                                      recipe:food myfood:Bacon ;
                                      recipe:quantity "125g" ] ) .

then you could use a query like this:

select ?recipe (group_concat(?food) as ?ingredients) where {
  ?recipe a recipe:Recipe ; 
          recipe:ingredients [ rdf:rest*/rdf:first [ recipe:food ?food ] ] .
}
group by ?recipe

to get the same results (in this case), but you'd be guaranteed to get only results that are actually elements of the ingredient list, and you wouldn't need any RDF(S) reasoning. As a matter of opinion, I think it also makes the data and the query a little bit clearer.
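To make it clearer what the property path rdf:rest*/rdf:first matches, here is a rough sketch of the same traversal done by hand in Python. It assumes the list is available as (subject, predicate, object) tuples of plain strings; the function name list_members and the blank node labels are made up for the example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def list_members(triples, head):
    """Yield the rdf:first value of every node reachable from `head`
    by following zero or more rdf:rest links."""
    # Index (subject, predicate) -> object; a well-formed list node has
    # exactly one rdf:first and one rdf:rest.
    index = {(s, p): o for s, p, o in triples}
    node, seen = head, set()
    while node != NIL and node not in seen:  # `seen` guards against cycles
        seen.add(node)
        if (node, FIRST) in index:
            yield index[(node, FIRST)]
        node = index.get((node, REST), NIL)


# A two-element list, as the ( ... ) Turtle syntax would expand it
# (hypothetical blank node labels).
triples = [
    ("_:l1", FIRST, "_:chickenIngredient"),
    ("_:l1", REST, "_:l2"),
    ("_:l2", FIRST, "_:baconIngredient"),
    ("_:l2", REST, NIL),
]

members = list(list_members(triples, "_:l1"))
print(members)  # ['_:chickenIngredient', '_:baconIngredient']
```

Only nodes that are linked into the list are visited, which is why this representation can't pick up unrelated properties the way the ?ingredientList ?p ?s pattern can.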

Joshua Taylor
  • Thanks for the reply, some good points there regarding how to tidy up my data structure and query. However it hasn't resolved the issue - I excluded rdf from the prefix list as it is included in the default list declared within arc2. What seems to be the problem is that arc2 doesn't appear to support group_concat - are you aware of another way to replicate the behavior? – palacealex Dec 30 '13 at 09:53
  • @palacealex Sorry for the misunderstanding; since there was no query in the question that actually used `group_concat` or showed problematic results with it, I thought the problem was in formulating a working `group_concat` query. (This is one of the reasons that it's so important to include attempted solutions and results (both actual and expected).) I've commented on the question, and also made a note about ARC2's release date which is before SPARQL 1.1 (which defines `group_concat`) was published. – Joshua Taylor Dec 30 '13 at 14:59