
I am trying to query DBpedia using SPARQLWrapper in Python (v3.3). This is my query:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?slot WHERE {
  <http://dbpedia.org/resource/Week> <http://www.w3.org/2002/07/owl#sameAs> ?slot
}

It results in an error from the SPARQLWrapper package:

ValueError: Invalid \escape: line 118 column 74 (char 11126)

Code:

query = "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?slot WHERE{{ {subject} {predicate} {object} }} "
query = query.format(subject=subject, predicate=predicate, object=objectfield)
self.sparql.setQuery(query)
self.sparql.setReturnFormat(JSON)
results = self.sparql.query().convert()  # Error thrown at this line

Error:

Traceback (most recent call last):
  File "getUriLiteralAgainstPredicate.py", line 84, in <module>
    sys.exit(main())
  File "getUriLiteralAgainstPredicate.py", line 61, in main
    entity,predicateURI,result = p.getObject(dataAtURI,predicates, each["entity"])
  File "getUriLiteralAgainstPredicate.py", line 30, in getObject
    result = self.run_sparql("<"+subjectURI+">","<"+predicateURI+">","?slot")
  File "getUriLiteralAgainstPredicate.py", line 24, in run_sparql
    results = self.sparql.query().convert()
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/SPARQLWrapper-1.5.2-py3.3.egg/SPARQLWrapper/Wrapper.py", line 539, in convert
    return self._convertJSON()
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/SPARQLWrapper-1.5.2-py3.3.egg/SPARQLWrapper/Wrapper.py", line 476, in _convertJSON
    return jsonlayer.decode(self.response.read().decode("utf-8"))
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/SPARQLWrapper-1.5.2-py3.3.egg/SPARQLWrapper/jsonlayer.py", line 76, in decode
    return _decode(string)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages/SPARQLWrapper-1.5.2-py3.3.egg/SPARQLWrapper/jsonlayer.py", line 147, in <lambda>
    _decode = lambda string, loads=json.loads: loads(string)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/decoder.py", line 352, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/json/decoder.py", line 368, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid \escape: line 118 column 74 (char 11126)
Ankit Solanki
  • Are you trying to load the pages? – aIKid Nov 08 '13 at 16:27
  • Please show your python code. Your query is legal, according to sparql.org's validator, and it only has one line, so if there are 118 lines of something, it's probably the results, but we can't see what you're trying to do with them. Also, please show the error message verbatim (i.e., copy and paste it here). It could be that the DBpedia results are being read in an unexpected way. How are you making the query? – Joshua Taylor Nov 08 '13 at 19:43
  • Thank you for posting some of the code, but that's not enough for anyone to, e.g., copy and paste it so as to reproduce the problem locally. Please post a minimal working example, as per the guidance: "Questions concerning problems with code you've written must describe the specific problem, and include **valid code to reproduce it**, in the question itself. See http://SSCCE.org for guidance." – Joshua Taylor Nov 09 '13 at 13:48
  • Even this code does help a bit, though, since now we know something we didn't before: your result set is coming back in JSON, and the "Invalid \escape: line 118 column 74 (char 11126)" is occurring while decoding the JSON (json/decoder.py). – Joshua Taylor Nov 09 '13 at 13:52
  • I'm the maintainer of the library and I'd be happy to help. The problem decoding the JSON response is hard to evaluate without knowing what raw JSON is causing the issue, but with the current evidence I'd say it is a server-side issue (invalid JSON generated) rather than a client-side one (in SPARQLWrapper itself). – wikier Nov 15 '13 at 14:11
  • Hello Wikier, thanks for your reply. The problem at hand needs to be solved on the client side even if the issue is on the server side, as one can't make any changes at DBpedia. I have been trying to catch the result set in an exception block, remove the characters causing the exception, and move ahead, but then the problem becomes wrapping it into a result set object again. Do you have any suggestions for solving this problem? – Ankit Solanki Dec 02 '13 at 09:36

2 Answers


The problem is that the DBpedia output contains this line:

{ "slot": { "type": "uri", "value": "http://got.dbpedia.org/resource/\U00010345\U00010339\U0001033A\U00010349" }},

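Notice the escapes that start with \U (capital U) followed by eight hex digits. This is not valid JSON, which only allows \u followed by four hex digits, so Python's json module cannot parse it. The problem is therefore on the DBpedia side, and SPARQLWrapper cannot be expected to handle it.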
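As a small illustration of the point about \U escapes, the sketch below feeds two spellings of the same character (U+10345) to Python's standard json module. The helper name `is_valid_json` is made up for this example and is not part of SPARQLWrapper.

```python
import json


def is_valid_json(text):
    """Return True if the standard json module can parse `text`."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


# JSON spells U+10345 as a surrogate pair of four-digit \u escapes.
print(is_valid_json(r'"\ud800\udf45"'))

# The eight-digit \U form that DBpedia emitted is not a JSON escape.
print(is_valid_json(r'"\U00010345"'))
```

The first call should be accepted and the second rejected with the same "Invalid \escape" error shown in the question.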

However, you can work around it yourself like this:

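from SPARQLWrapper.Wrapper import jsonlayer

results = self.sparql.query()
body = results.response.read()  # raw bytes of the response, not yet parsed

# Turn the \UXXXXXXXX escapes into real characters before parsing.
# Note: "unicode_escape" also decodes every other backslash escape in the body.
fixed_body = body.decode("unicode_escape")

results = jsonlayer.decode(fixed_body)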
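Because the "unicode_escape" codec decodes every backslash escape it finds (including \" and \n inside JSON strings), a narrower alternative is to rewrite only the eight-digit \U escapes into valid JSON \u escapes and leave everything else alone. The following is a minimal sketch of that idea, not something SPARQLWrapper provides; the function name `fix_big_u_escapes` and the regular expression are assumptions made for this example.

```python
import json
import re

# A \U escape with eight hex digits, preceded by an even number of
# backslashes (so an escaped backslash followed by "U" is left alone).
_BIG_U = re.compile(r'(?<!\\)((?:\\\\)*)\\U([0-9A-Fa-f]{8})')


def _to_json_escape(match):
    prefix, code_point = match.group(1), int(match.group(2), 16)
    if code_point > 0x10FFFF:
        return match.group(0)  # not a Unicode code point; leave untouched
    if code_point <= 0xFFFF:
        return prefix + "\\u%04x" % code_point
    # Characters beyond the BMP are written as a UTF-16 surrogate pair.
    code_point -= 0x10000
    high = 0xD800 + (code_point >> 10)
    low = 0xDC00 + (code_point & 0x3FF)
    return prefix + "\\u%04x\\u%04x" % (high, low)


def fix_big_u_escapes(text):
    """Rewrite \\UXXXXXXXX escapes in `text` as valid JSON \\u escapes."""
    return _BIG_U.sub(_to_json_escape, text)


# Sample modelled on the offending line from the DBpedia response.
raw = ('{"slot": {"type": "uri", "value": '
       '"http://got.dbpedia.org/resource/\\U00010345\\U00010339"}}')
data = json.loads(fix_big_u_escapes(raw))
print(len(data["slot"]["value"]))
```

In the setting of the question, the same function could be applied to `self.sparql.query().response.read().decode("utf-8")` before passing the text to `json.loads`, assuming the response body is UTF-8 as SPARQLWrapper itself assumes in the traceback.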
JimiDini

Try python-cjson.

The workaround above can also be written with cjson as follows:

import cjson
results = self.sparql.query()
body = results.response.read()
results = cjson.decode(body)
Gunjan