How to get only the first value from an optional property?

Question

Like in the SQL aggregate MAX, MIN or FIRST, it gets only one value, not duplicating lines.

Real Wikidata case

Where the OPTIONAL clause expands from 253 to 257 lines:

# Countries and its codes
SELECT ?code  ?item ?itemLabel ?osmId
WHERE 
{
  ?item wdt:P297 ?code.
  OPTIONAL{?item wdt:P402 ?osmId .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code

try here

I need only one (any) osmId.
How to do something like FIRST{OPTIONAL{?item wdt:P402 ?osmId .}} ?

NOTES:

it is not a duplicate of How to get only the most recent value from a Wikidata property?
it is not a duplicate of Why does this Wikidata SPARQL query only work for the first element in a list?
... no exactly need for simple "any first".

Peter Krauss · Answer 1 · 2021-03-08T01:09:47.223

Here a WIKI answer (please you can edit to enhance this answer!)

# Countries and its codes
SELECT ?code  ?item ?itemLabel
       (MAX(?osmId) as ?osmId_max) (COUNT(?code) as ?osmId_n)  
WHERE 
{
  ?item wdt:P297 ?code.
  OPTIONAL{?item wdt:P402 ?osmId .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code  ?item ?itemLabel 
ORDER BY ?code

try

The COUNT(?code) is only to check the lines where osmId was not an Unique-ID.

Other simple solution to filter only the first option?

Using SAMPLE

As the @ValerioCocchi suggestion, we can use SAMPLE instead MAX:

SELECT ?code  ?item ?itemLabel (SAMPLE(?osmId) as ?osmId_sample)
WHERE
{
  ?item wdt:P297 ?code.
  OPTIONAL{?item wdt:P402 ?osmId .}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?code  ?item ?itemLabel 
ORDER BY ?code

try

SAMPLE use a little bit less CPU-time, but the main motivation to use is when you don’t care which value is returned. In the case of Wikidata, when the property-value is to be unique but there are some (minimal) errors, and you can ignore them.

NOTE about the osmId: the advantage of MAX in this particular query, using an numeric ID related to a temporal sequence, is that it can be a "fresher" ID... But in OpenStreetMap (OSM) the strategy can be the inverse: most old is the most stable ID. So, SAMPLE make sense also in a context of ignorance about better strategy.

Using FILTER

The @StanislavKralin suggestion:

SELECT ?code  ?item ?itemLabel ?osmId
WHERE 
{
  ?item wdt:P297 ?code.
  OPTIONAL{
    ?item wdt:P402 ?osmId 
    FILTER NOT EXISTS {
      ?item wdt:P402 ?osmId, ?osmId_ .
      FILTER (?osmId_ > ?osmId)
    }
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
ORDER BY ?code

try

Seems more verbose.

You can also look at using the `SAMPLE` function in SPARQL. – Valerio Cocchi Mar 06 '21 at 15:03 — Valerio Cocchi, Mar 06 '21 at 15:03

How to get only the first value from an optional property?

Real Wikidata case

1 Answers1

Using SAMPLE

Using FILTER