I'm trying to request generated URLs like this one:

http://www.viaf.org/viaf/search?query=cql.any+=+%22Jean-Claude%20Moissinac%22&maximumRecords=5&httpAccept=application/json

The request fails when I build the URL in code, as in the second half of this script:

# -*- encoding: utf-8 -*-
import urllib.request
# successful trial with the URI
urlQuery = u'http://www.viaf.org/viaf/search?query=cql.any%20=%20"Bacache%20Maya"&httpAccept=application%2Fjson&maximumRecords=5'
print(urlQuery)
req = urllib.request.Request(urlQuery)
with urllib.request.urlopen(req) as rep:
  print("success")

# attempt to build the URI; request fails
viafBaseUrl = u"http://www.viaf.org"
viafCommand = u"/viaf/search?"
viafSearchTemplate = u'"__name%20__surname"'
name = u"Bacache"
surname = u"Maya"
searchString = u'cql.any%20=%20' + viafSearchTemplate.replace(u"__surname", surname).replace(u"__name", name)
params = u"query="+searchString+u"&httpAccept=application%2Fjson&maximumRecords=5"
computedQuery = viafBaseUrl + viafCommand + params
print(urlQuery)
if computedQuery==urlQuery:
  print("same strings")
req = urllib.request.Request(computedQuery)
with urllib.request.urlopen(req) as rep:
  print("success")

The first request is successful, while the second fails with this error:

UnicodeEncodeError: 'ascii' codec can't encode character '\ufeff' in position 76: ordinal not in range(128)

I tried many workarounds without success. Using urllib.parse.urlencode() does not help because it changes some characters which must remain intact.

Both URLs look identical when printed, yet the strings compare as different, and I don't understand how to make the computed string equal to the working one.

Bellerofont

2 Answers

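There is a hidden Unicode character in the string application%2F, between the n and the %. The error message identifies it as '\ufeff' (U+FEFF, the zero-width no-break space, also used as a byte order mark). Delete it and the request should work.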
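
To locate a character like this without relying on what the editor displays, the string can be scanned for anything outside ASCII. The following is a minimal sketch; the helper name find_non_ascii is hypothetical, and the \ufeff escape in the sample string stands in for the invisible character that was pasted into the original source.

```python
import unicodedata

def find_non_ascii(text):
    """Return (index, 'U+XXXX', name) for every non-ASCII character in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]

# The \ufeff escape stands in for the invisible character in the pasted literal.
params = "&httpAccept=application\ufeff%2Fjson&maximumRecords=5"

hits = find_non_ascii(params)
print(hits)

# Removing the offending character leaves a string that encodes cleanly as ASCII,
# which is the encoding step that raised UnicodeEncodeError in the question.
cleaned = params.replace("\ufeff", "")
cleaned.encode("ascii")
print(cleaned)
```

Running the same helper on computedQuery from the question should point at the position of the stray character in the full URL.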

Alden

In your second print statement you accidentally print the first query, urlQuery, instead of computedQuery, so the two outputs are bound to look the same. The computed query also contains an extra character right after "application" (the '\ufeff' reported in the error), which is easier to spot once the print statement is fixed.

Updated code below with the fixes and a couple of comments:

# -*- encoding: utf-8 -*-
import urllib.request
# successful trial with the URI
urlQuery = u'http://www.viaf.org/viaf/search?query=cql.any%20=%20"Bacache%20Maya"&httpAccept=application%2Fjson&maximumRecords=5'
print(urlQuery)
req = urllib.request.Request(urlQuery)
with urllib.request.urlopen(req) as rep:
  print("success")

# attempt to build the URI; request fails
viafBaseUrl = u"http://www.viaf.org"
viafCommand = u"/viaf/search?"
viafSearchTemplate = u'"__name%20__surname"'
name = u"Bacache"
surname = u"Maya"
searchString = u'cql.any%20=%20' + viafSearchTemplate.replace(u"__surname", surname).replace(u"__name", name)
params = u"query="+searchString+u"&httpAccept=application%2Fjson&maximumRecords=5" # invisible character after "application" deleted
computedQuery = viafBaseUrl + viafCommand + params
print(computedQuery) # was urlQuery
if computedQuery==urlQuery:
  print("same strings")
req = urllib.request.Request(computedQuery)
with urllib.request.urlopen(req) as rep:
  print("success")
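
As an alternative to concatenating pre-escaped fragments, the escaping can be left to urllib.parse.quote, which avoids pasting %20 and %2F sequences by hand. This is only a sketch: the names VIAF_SEARCH and build_viaf_query are hypothetical, and leaving = and " unescaped is an assumption made to reproduce the working URL from the question exactly.

```python
from urllib.parse import quote

# Hypothetical constant: the search endpoint used in the question.
VIAF_SEARCH = "http://www.viaf.org/viaf/search"

def build_viaf_query(name, surname, max_records=5):
    """Build the search URL from plain, unescaped values."""
    cql = f'cql.any = "{name} {surname}"'
    # safe='="' keeps = and " as they are (assumption: mirrors the working URL);
    # spaces become %20 and non-ASCII characters are percent-encoded as UTF-8.
    query = quote(cql, safe='="')
    # safe="" forces / to be escaped as %2F, as in the working URL.
    accept = quote("application/json", safe="")
    return f"{VIAF_SEARCH}?query={query}&httpAccept={accept}&maximumRecords={max_records}"

computed = build_viaf_query("Bacache", "Maya")
print(computed)
```

Because every value passes through quote, a stray non-ASCII character would end up percent-encoded in the URL instead of raising UnicodeEncodeError, so it is still worth checking the input if the server returns unexpected results.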
nkconnor