
I'm pretty new to SPARQL and I'm trying to add a language filter for the literals returned by my query template. The problem is that the dataset doesn't provide a well-structured tagging system, so I have to restrict the results conditionally AND make these conditions work with the following template:

SELECT DISTINCT ?s ?p ?o WHERE {
  ?this :is :something.  # this pattern is passed as an argument
                         # to the JS function that manages the
                         # query
  {
    BIND(?this AS ?s)
    ?s ?p ?o.
    # here I put my filters
  }
}

The filter I want to add should return only the English version for patterns with literals. If no English version is provided, it should return the untagged one; there is always an untagged one.

Based on solutions for similar questions, I tried with this:

filter(!isLiteral(?o) || (langMatches(lang(?o), 'en')) || lang(?o)='')

but, of course, this ends up returning both the English and the untagged literals.

Another approach, which solved the problem for someone else, is to use two OPTIONAL patterns, like here:

optional { 
  ?country rdfs:label ?label
  filter langMatches(lang(?label), "en")
}
optional { 
  ?country rdfs:label ?label
}

but the ?s ?p ?o pattern in my template already returns all the triples associated with a specific subject, so the OPTIONAL patterns seem useless here.

I've read other similar questions, but none of them seems to fit my query template, so if someone can help me understand this I'd be grateful.

  • Interesting question, but I'm not sure if it's possible. The solution with the two OPTIONAL patterns works because it contains two steps by means of two consecutive left joins. In your query, which is literally just a single triple pattern `?s ?p ?o .`, the engine would have to create a group of values for each `(?s ?p)` pair and then, inside this group, prefer English over non-tagged literals. – UninformedUser Jun 20 '19 at 06:51
  • But even here, what if there are multiple literals? For example, we have a subject `:s` and a property `:p` and we have objects `"l1"@en, "l1", "l2", "l3"@en` - I mean, this example makes clear how difficult it is, right? Just saying "give me all English literals" here would return `"l1"@en, "l3"@en`, which is not what you want. And we would get the same result with "give me all English literals, but if there aren't any I'll also take those without a language tag". – UninformedUser Jun 20 '19 at 06:55
  • So what we would really need is to build more groups; in fact, we would need one group per lexical form of the literal. In the example, we get three groups `{"l1"@en, "l1"}; {"l2"}; {"l3"@en}` - and now you could try to select the literals in each group based on language preference. – UninformedUser Jun 20 '19 at 06:57
  • I don't know if this is possible in SPARQL. It might be more efficient to just use the `filter(!isLiteral(?o) || (langMatches(lang(?o), 'en')) || lang(?o)='')` filter and do the remaining filtering on the client side. I might be wrong and thinking too complicated; others here might have a solution, so let's wait for the real SPARQL experts. You could also try to make use of the `exists` feature and check that there is no English literal with the same lexical form. – UninformedUser Jun 20 '19 at 06:59
  • Using the two OPTIONAL patterns the way the discussion I linked does always returned both versions of my literals (let me add that there are at most 2 versions per literal, '@en' and '@it', and when no English version is provided, the Italian one appears untagged). – felix.jumanji Jun 20 '19 at 17:18
  • My last sentence was a little inaccurate: the '@it' tag on Italian literals is sometimes missing even when the English version of the literal is present. What a messy dataset! – felix.jumanji Jun 20 '19 at 17:31
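The grouping idea from the comments above can be sketched on the client side. This is only an illustration in Python, assuming the query results have already been fetched with the permissive filter; the names `Literal`, `is_english` and `pick_per_lexical_form` are hypothetical and not part of any library. It groups the objects of one `(?s ?p)` pair by lexical form and, inside each group, prefers English over untagged literals.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Literal(NamedTuple):
    lexical: str  # lexical form, e.g. "l1"
    lang: str     # language tag, "" when the literal is untagged


def is_english(tag: str) -> bool:
    # Rough equivalent of langMatches(tag, "en"): "en", "en-GB", "EN-us", ...
    tag = tag.lower()
    return tag == "en" or tag.startswith("en-")


def pick_per_lexical_form(literals: Iterable[Literal]) -> List[Literal]:
    # One group per lexical form, in order of first appearance.
    groups = defaultdict(list)
    for lit in literals:
        groups[lit.lexical].append(lit)

    picked = []
    for members in groups.values():
        english = [m for m in members if is_english(m.lang)]
        untagged = [m for m in members if m.lang == ""]
        # Prefer English; fall back to untagged. Other languages are dropped.
        picked.extend(english or untagged)
    return picked


# The example from the comments: "l1"@en, "l1", "l2", "l3"@en
objects = [Literal("l1", "en"), Literal("l1", ""),
           Literal("l2", ""), Literal("l3", "en")]
result = pick_per_lexical_form(objects)
```

On the commenter's example this keeps `"l1"@en`, `"l2"` and `"l3"@en`. Note that it only helps when the tagged and untagged versions share a lexical form, which may not hold for a dataset where the untagged literal is in another language.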

2 Answers


If I'm interpreting your question correctly, this is doable with a single FILTER whose disjuncts are ordered by preference, e.g.:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE {
    BIND(skos:member AS ?s)
    ?s ?p ?o .
    FILTER (
        # 1. keep every non-literal object
        !isLiteral(?o)
        # 2. keep literals tagged as English
        || langMatches(lang(?o), "en")
        # 3. keep untagged literals only if ?s ?p has no English literal
        || (langMatches(lang(?o), "") && NOT EXISTS {
                ?s ?p ?other .
                FILTER(isLiteral(?other) && langMatches(lang(?other), "en"))
            })
    )
}
  • it passes any non-literal value bound to ?o
  • then literals whose language tag matches @en
  • and finally literals without a language tag, provided there is no statement with the same subject and predicate whose object is a literal with an @en tag

HTH

The use of skos:member in the query above is arbitrary; it is there only to have something bound to ?s.
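To see what the three disjuncts do on a concrete case, here is a small Python model of the filter, applied to an in-memory list of triples rather than a real endpoint. It is a sketch only: `Term`, `is_english`, `keep` and the sample `graph` are made up for illustration, and `is_english` merely approximates `langMatches(..., "en")`.

```python
from typing import List, NamedTuple, Optional, Tuple


class Term(NamedTuple):
    value: str
    lang: Optional[str] = None  # None: IRI or blank node; "": untagged literal


Triple = Tuple[str, str, Term]


def is_english(tag: str) -> bool:
    # Approximation of langMatches(tag, "en")
    tag = tag.lower()
    return tag == "en" or tag.startswith("en-")


def keep(triple: Triple, graph: List[Triple]) -> bool:
    s, p, o = triple
    if o.lang is None:        # disjunct 1: !isLiteral(?o)
        return True
    if is_english(o.lang):    # disjunct 2: literal tagged as English
        return True
    if o.lang == "":          # disjunct 3: untagged and NOT EXISTS an English sibling
        return not any(
            s2 == s and p2 == p and other.lang is not None and is_english(other.lang)
            for s2, p2, other in graph
        )
    return False              # literal in some other language


# Hypothetical data: the objects from the comments plus two extra predicates.
graph = [
    (":s", ":p", Term("l1", "en")),
    (":s", ":p", Term("l1", "")),
    (":s", ":p", Term("l2", "")),
    (":s", ":p", Term("l3", "en")),
    (":s", ":q", Term("solo", "")),
    (":s", ":r", Term(":other")),
]
kept = [t for t in graph if keep(t, graph)]
```

Here `kept` contains `"l1"@en` and `"l3"@en` for `:p`, the untagged `"solo"` for `:q` (no English sibling), and the IRI for `:r`. The untagged `"l1"` and `"l2"` are dropped because the preference is decided per subject and predicate, not per lexical form.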

Damyan Ognyanov
  • This works perfectly for my purpose. I was thinking of using EXISTS in some way but couldn't figure out how to make it work. Thanks! – felix.jumanji Jun 20 '19 at 17:08

A bit late, but a solution I just found is:

SELECT DISTINCT ?s ?p ?o WHERE {
  ?this :is :something.  # this pattern is passed as an argument
                         # to the JS function that manages the
                         # query
  {
    BIND(?this AS ?s)
    ?s ?p ?o.
    FILTER(IF(NOT EXISTS {SELECT ?r WHERE {?s ?p ?r FILTER(lang(?r) = 'en')}}, lang(?o) = '', lang(?o) = 'en'))
  }
}

The filter FILTER(IF(NOT EXISTS {SELECT ?r WHERE {?s ?p ?r FILTER(lang(?r) = 'en')}}, lang(?o) = '', lang(?o) = 'en')) returns the results with the 'en' tag if one exists; otherwise, it returns the results with no language tag.
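The behaviour of this IF-based filter can be modelled in a few lines of Python. This is a sketch under assumptions: it presumes the engine correlates ?s and ?p inside the sub-SELECT with the outer row, as the answer intends, and `Term`, `SparqlTypeError`, `lang` and `if_filter` are hypothetical names. Two details are worth seeing in the model: `lang(?o)` raises an error for a non-literal, and an error inside FILTER eliminates the row, so IRIs are not returned; and `lang(?o) = 'en'` is an exact comparison, so a tag such as `en-GB` does not count as English.

```python
from typing import List, NamedTuple, Optional


class Term(NamedTuple):
    value: str
    lang: Optional[str] = None  # None: IRI or blank node; "": untagged literal


class SparqlTypeError(Exception):
    """Stands in for the error SPARQL raises for lang() on a non-literal."""


def lang(term: Term) -> str:
    if term.lang is None:
        raise SparqlTypeError(term.value)
    return term.lang


def if_filter(o: Term, siblings: List[Term]) -> bool:
    # siblings: every object of the same ?s ?p (the ?r of the sub-SELECT).
    # Non-literal siblings never satisfy lang(?r) = 'en', so they are skipped.
    no_english = not any(r.lang == "en" for r in siblings)
    try:
        # IF(NOT EXISTS {...}, lang(?o) = '', lang(?o) = 'en')
        return lang(o) == "" if no_english else lang(o) == "en"
    except SparqlTypeError:
        # An error in a FILTER expression removes the solution.
        return False


tagged = [Term("name", "en"), Term("nome", "")]
only_untagged = [Term("nome", ""), Term(":other")]
regional = [Term("colour", "en-GB"), Term("colore", "")]
```

With `tagged`, only `"name"@en` passes; with `only_untagged`, the untagged `"nome"` passes and the IRI is dropped; with `regional`, the exact comparison treats the pair as having no English literal, so `"colore"` passes and `"colour"@en-GB` does not.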