10000 row DBpedia Query result set size limit

Question

This is my first time playing with SPARQL. I have written the query below, but I am only getting the first 10000 results. How can I get all results from DBpedia?

    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("http://dbpedia.org/sparql")
    sparql.setQuery("""
        PREFIX dbpedia0: <http://dbpedia.org/ontology/>
        PREFIX dbpedia2: <http://dbpedia.org/property/>
        SELECT str(?song) as ?song str(?artist) as ?artist str(?genre) as ?genre WHERE {
        ?song a dbpedia0:Single.
        ?song dbpedia0:genre ?genre.
        ?song dbpedia0:musicalArtist ?artist
        }

        ORDER BY ?genre
        """)
    print('\n\n*** JSON Example')
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    prefix = "http://dbpedia.org/resource/"
    for result in results["results"]["bindings"]:
        # Strip the resource prefix and print genre, artist, song separated by tabs.
        print("\t\t".join(result[key]["value"].replace(prefix, "") for key in ("genre", "artist", "song")))

I found something about `OFFSET` and `LIMIT`, but I am not sure how to use them to get ALL results.

You can't **remove** the default limit set by the public DBpedia service. You can work around it with some kind of pagination, that is, by running queries with `OFFSET 10000`, `OFFSET 20000`, and so on until the result set is empty. For correctness, this workaround also needs `ORDER BY`, which is pretty expensive. — UninformedUser, May 25 '18 at 10:52
By the way, instead of doing this string replacement hack `.replace("http://dbpedia.org/resource/", "")`, the "better" way would be to get the English `rdfs:label` of the resources, as those labels are supposed to provide a human-readable form. — UninformedUser, May 25 '18 at 10:54
First of all, thank you for both comments, as they are extremely helpful. Re: OFFSET, so a loop is required until the result set is empty? — joe borg, May 25 '18 at 10:57
Short answer: yes, a loop inside your client code that increases the `OFFSET` value by the default limit of the SPARQL endpoint - in your case `10 000` — UninformedUser, May 25 '18 at 11:18
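The loop described in the comments above might look roughly like the following. This is a minimal sketch, not a tested recipe for the live endpoint: the names `BASE_QUERY`, `page_query`, `fetch_all`, and `run_on_dbpedia` are made up for illustration, the query is rewritten in plain `SELECT ?song ?artist ?genre` form, and the 10000 page size is taken from the comments rather than verified. Ordering by all three variables is an assumption made here so that pages come back in a stable order.

```python
from typing import Callable, Dict, Iterator, List

ENDPOINT = "http://dbpedia.org/sparql"
PAGE_SIZE = 10000  # assumed cap of the public endpoint, as stated in the comments

# Query from the question without LIMIT/OFFSET; those are appended per page.
# Sorting on all selected variables keeps the order stable between requests.
BASE_QUERY = """
PREFIX dbpedia0: <http://dbpedia.org/ontology/>
SELECT ?song ?artist ?genre WHERE {
    ?song a dbpedia0:Single .
    ?song dbpedia0:genre ?genre .
    ?song dbpedia0:musicalArtist ?artist
}
ORDER BY ?genre ?artist ?song
"""

Binding = Dict[str, Dict[str, str]]


def page_query(offset: int, limit: int = PAGE_SIZE) -> str:
    """Return the base query restricted to one page of results."""
    return f"{BASE_QUERY}LIMIT {limit} OFFSET {offset}"


def fetch_all(run_query: Callable[[str], List[Binding]],
              page_size: int = PAGE_SIZE) -> Iterator[Binding]:
    """Yield every binding, requesting pages until one comes back empty."""
    offset = 0
    while True:
        rows = run_query(page_query(offset, page_size))
        if not rows:
            return
        yield from rows
        # Advance by what was actually returned, in case the server caps
        # a page below the requested size.
        offset += len(rows)


def run_on_dbpedia(query: str) -> List[Binding]:
    """Run one query against DBpedia and return its bindings (hypothetical helper)."""
    # Third-party package, imported lazily so the paging logic can be
    # exercised without it.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()["results"]["bindings"]
```

With this in place, `for row in fetch_all(run_on_dbpedia): print(row["genre"]["value"])` would walk through all pages. Passing the query runner in as a function keeps the paging loop separate from the network call, so it can be tried against a fake runner first. The public endpoint may apply other limits as well, so large offsets are not guaranteed to succeed.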
Well, simply `?song rdfs:label ?songLabel .` or what do you mean? And don't forget to select the corresponding variable and add a language filter for e.g. English. — UninformedUser, May 25 '18 at 13:08
For the song it is easy, but for the genre and artist I am finding it hard! :( Is there a tutorial anywhere showing this label thing? — joe borg, May 25 '18 at 13:11
What kind of tutorial should this be? It's always the same, add a triple pattern that matches additional data. What's wrong with adding `?genre rdfs:label ?genreLabel .`? — UninformedUser, May 25 '18 at 13:36
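To make the label suggestion concrete, here is a sketch of the question's query with one `rdfs:label` pattern per resource and an English language filter on each label. The variable names `?songLabel`, `?artistLabel`, and `?genreLabel` and the helper `format_row` are illustrative choices, not something prescribed by DBpedia. Note that, written this way, any row where one of the three resources has no English label is dropped from the results.

```python
from typing import Dict

# The question's query extended with labels; result variables are the
# human-readable labels rather than the resource URIs.
LABEL_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbpedia0: <http://dbpedia.org/ontology/>
SELECT ?songLabel ?artistLabel ?genreLabel WHERE {
    ?song a dbpedia0:Single .
    ?song dbpedia0:genre ?genre .
    ?song dbpedia0:musicalArtist ?artist .
    ?song rdfs:label ?songLabel .
    ?artist rdfs:label ?artistLabel .
    ?genre rdfs:label ?genreLabel .
    FILTER (lang(?songLabel) = "en")
    FILTER (lang(?artistLabel) = "en")
    FILTER (lang(?genreLabel) = "en")
}
ORDER BY ?genreLabel
"""


def format_row(binding: Dict[str, Dict[str, str]]) -> str:
    """Join genre, artist, and song labels with tabs, as in the question's output."""
    names = ("genreLabel", "artistLabel", "songLabel")
    return "\t\t".join(binding[name]["value"] for name in names)
```

Passing `LABEL_QUERY` to `sparql.setQuery(...)` in the question's script and printing `format_row(result)` for each binding would replace the `.replace("http://dbpedia.org/resource/", "")` calls.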
This and other public endpoint limits and restrictions are [discussed on the DBpedia website](https://wiki.dbpedia.org/public-sparql-endpoint). — TallTed, Jun 08 '18 at 21:46

