I'm trying to use generated URLs like this one http://www.viaf.org/viaf/search?query=cql.any+=+%22Jean-Claude%20Moissinac%22&maximumRecords=5&httpAccept=application/json
but the request fails when I build the URL with the following code:
# -*- encoding: utf-8 -*-
import urllib.request
# successful trial with the URI
urlQuery = u'http://www.viaf.org/viaf/search?query=cql.any%20=%20"Bacache%20Maya"&httpAccept=application%2Fjson&maximumRecords=5'
print(urlQuery)
req = urllib.request.Request(urlQuery)
with urllib.request.urlopen(req) as rep:
    print("success")
# attempt to build the URI; request fails
viafBaseUrl = u"http://www.viaf.org"
viafCommand = u"/viaf/search?"
viafSearchTemplate = u'"__name%20__surname"'
name = u"Bacache"
surname = u"Maya"
searchString = u'cql.any%20=%20' + viafSearchTemplate.replace(u"__surname", surname).replace(u"__name", name)
params = u"query="+searchString+u"&httpAccept=application%2Fjson&maximumRecords=5"
computedQuery = viafBaseUrl + viafCommand + params
print(computedQuery)
if computedQuery==urlQuery:
    print("same strings")
req = urllib.request.Request(computedQuery)
with urllib.request.urlopen(req) as rep:
    print("success")
The first request is successful, while the second fails with this error:
UnicodeEncodeError: 'ascii' codec can't encode character '\ufeff' in position 76: ordinal not in range(128)
I have tried many ways to work around the problem, without success.
Using urllib.parse.urlencode() fails because it changes some characters that must remain intact.
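To illustrate what "must remain intact" means here, this is a minimal sketch of how urlencode() might be told to leave certain characters alone, using its quote_via and safe parameters. The set of characters kept unescaped (= and ") is an assumption taken from the working URL above, not something verified against VIAF, and the variable names are only illustrative.

```python
from urllib.parse import quote, urlencode

# Illustrative values taken from the example above.
name, surname = "Bacache", "Maya"

# Build the CQL expression with plain spaces and let the encoder escape them.
cql = f'cql.any = "{name} {surname}"'

params = {
    "query": cql,
    "httpAccept": "application/json",
    "maximumRecords": 5,
}

# quote_via=quote encodes spaces as %20 (the default quote_plus would use +).
# safe='="' is an assumption: it keeps = and " unescaped, as in the working URL.
query_string = urlencode(params, quote_via=quote, safe='="')

url = "http://www.viaf.org/viaf/search?" + query_string
print(url)
```

Because the string is built from plain literals and escaped in one place, any stray non-ASCII character would come out percent-encoded rather than reaching the request unescaped.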
The two URLs look identical when printed, but the strings are different, and I don't understand how to get identical strings.
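Since the error names the character '\ufeff', one way to see where two visually identical strings differ might be to list every non-ASCII character with its position. The sketch below is only a diagnostic under that assumption; find_non_ascii is a hypothetical helper, and the sample string is made up to contain a hidden U+FEFF rather than copied from my code.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, code point, Unicode name) for each non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Hypothetical sample: an invisible U+FEFF hidden inside a parameter string.
suspect = "httpAccept=application\ufeff%2Fjson"

# Lists the position and name of each offending character.
print(find_non_ascii(suspect))

# ascii() shows invisible characters as escape sequences.
print(ascii(suspect))

# Removing the character gives a pure-ASCII string.
cleaned = suspect.replace("\ufeff", "")
print(find_non_ascii(cleaned))
```

Running the same helper on computedQuery should show whether a character such as U+FEFF is hiding in one of the string literals.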