
My problem is the following:

  1. I get the following Unicode string from a Google search query: "Playa de Porc%C3%ADa".
  2. I need to convert "Playa de Porc%C3%ADa" into "Playa de Porcía" so that I can pass the new string to a function that runs another search.
  3. The problem is that the accented "í" shows up as the bytes C3 AD (its UTF-8 encoding). I have tried decode() and encode() in several ways but can't get it to work.

Regards!

Edit: My code is in Python 2.

Deepstop
javier
  • That's not a Unicode string. That's a percent-encoded string (also called URL-encoding). Unicode strings are those you see on every web site, including StackOverflow itself. They don't need special handling which is why I can write `Αυτό Εδώ` or `Playa de Porcía` and know that SO will display it properly without any encoding – Panagiotis Kanavos Jan 15 '19 at 14:13
  • Possible duplicate of [Url decode UTF-8 in Python](https://stackoverflow.com/questions/16566069/url-decode-utf-8-in-python) – Panagiotis Kanavos Jan 15 '19 at 14:24

1 Answer

That's not a Unicode string. That's a percent-encoded (URL-encoded) string.

For example, %20 is a space character. An online URL decoder is available at https://www.url-encode-decode.com/
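
To see why "%C3%AD" turns into "í", here is a minimal sketch, assuming Python 3 and only the standard `urllib.parse` module. Each %XX escape stands for one raw byte, and C3 AD happens to be the two-byte UTF-8 encoding of "í". The variable names are just for illustration.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

encoded = "Playa de Porc%C3%ADa"

# Step 1: turn every %XX escape back into the raw byte it stands for.
raw = unquote_to_bytes(encoded)
print(raw)  # b'Playa de Porc\xc3\xada'

# Step 2: interpret those bytes as UTF-8 text.
decoded = raw.decode("utf-8")
print(decoded)

# unquote() performs both steps at once (UTF-8 is its default encoding).
assert decoded == unquote(encoded)

# Going the other way, quote() percent-encodes the UTF-8 bytes of "í".
print(quote("í"))  # %C3%AD
```

This is also why plain decode() and encode() calls on the original string have no effect: the string contains only the ASCII characters "%", "C", "3", "A", "D", not the bytes themselves.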

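Python 2 version

import urllib
# unquote() returns a UTF-8 byte string in Python 2;
# append .decode("utf-8") if you need a unicode object
print urllib.unquote("Playa de Porc%C3%ADa")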

Python 3 version

# a bare "import urllib" does not reliably load the parse submodule
import urllib.parse
print(urllib.parse.unquote("Playa de Porc%C3%ADa"))

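Code that works in both versions (in Python 2 the result is a byte string; call .decode("utf-8") on it if you need a unicode object)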

try:
    from urllib import unquote
except ImportError:
    from urllib.parse import unquote

print(unquote("Playa de Porc%C3%ADa"))

Output

Playa de Porcía

https://docs.python.org/3/library/urllib.parse.html

frankegoesdown