
I don't know which encoding is used in these phrases (and I'd like an answer to that too). Mainly, I'd like to convert my own phrases to it.

For example:

Hello World! becomes Hello%20World!%0A

Olá mundo! becomes Ol%C3%A1%20mundo!%0A%0A

I'd like a Python solution for this.

If I have

>>> Phrase = 'Olá mundo!'

how do I get

>>> FinalPhrase
'Ol%C3%A1%20mundo!%0A%0A'

using Python?

Google uses it on its translation site, for example:

http://translate.google.com.br/#en|pt|Hello%20World!%0A%0A

http://translate.google.com.br/#pt|en|Ol%C3%A1%20mundo!%0A

I need this for web applications, to build URLs for connecting to sites that use this type of URL.

GarouDan
  • The keywords are `url encoding` or [`percent encoding`](http://tools.ietf.org/html/rfc3986#page-12) – jfs Nov 16 '11 at 18:57
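To make the term concrete, here is a minimal sketch of percent encoding written by hand in Python 3. It assumes the text is encoded as UTF-8 first, which matches the `%C3%A1` seen for `á` in the question. The function name `percent_encode` and its `keep` parameter are hypothetical, chosen only for illustration.

```python
# Unreserved characters from RFC 3986: letters, digits, and "-._~".
UNRESERVED = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-._~"
)


def percent_encode(text, keep="!"):
    """Hypothetical helper: percent-encode the UTF-8 bytes of `text`.

    Bytes listed in `keep` (ASCII only) are left untouched, as are the
    unreserved characters; every other byte becomes %XX in uppercase hex.
    """
    allowed = UNRESERVED | frozenset(keep.encode("ascii"))
    pieces = []
    for byte in text.encode("utf-8"):
        if byte in allowed:
            pieces.append(chr(byte))
        else:
            pieces.append("%{:02X}".format(byte))
    return "".join(pieces)


print(percent_encode("Hello World!\n"))
print(percent_encode("Olá mundo!\n\n"))
```

In practice the standard library already provides this, so a hand-written version is useful mainly for seeing that each `%XX` is simply one byte of the encoded text.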

1 Answer

>>> import urllib2
>>> urllib2.quote
<function quote at 0x104a10848>
>>> urllib2.quote("ü")
'%C3%BC'
>>> urllib2.quote('Olá mundo!')
'Ol%C3%A1%20mundo%21'
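The session above is Python 2. In Python 3, `urllib2` no longer exists and `quote` lives in `urllib.parse`. A rough equivalent, assuming Python 3 and a UTF-8 target, might look like the sketch below. The variable names are illustrative, and the two trailing newlines are appended only to reproduce the `%0A%0A` from the question.

```python
from urllib.parse import quote

phrase = "Olá mundo!"

# Default behaviour: "!" is not in the default safe set, so it becomes %21.
default_quoted = quote(phrase)

# Passing safe="!" leaves the exclamation mark alone; the appended
# newlines are what turn into the trailing %0A%0A.
google_style = quote(phrase + "\n\n", safe="!")

print(default_quoted)
print(google_style)
```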
Stefano Borini
  • Why do I get `'Ol%C3%A1%20mundo%21'` from `urllib2.quote("Olá mundo!")`? Is `%21` the `!`? Is there a way to keep it the way Google does? One of the sites I'd like to connect to is the Google Translator. Does this encoding convert every known symbol to its URL form? And what is the `%0A` that appears at the end but doesn't seem to do anything? – GarouDan Nov 16 '11 at 15:55
  • `%0A` is `\n`. And if you send `%21` instead of `!`, you should not have any problems. Anyway, `quote()` has an optional second parameter that can contain symbols which should not be quoted. – wRAR Nov 16 '11 at 16:03
  • @StefanoBorini On another site I got **ol%E1%20mundo!** for the input **Olá mundo!**. I tried using urllib instead of urllib2, but I got the same result... How can I get results in this form too? – GarouDan Nov 16 '11 at 23:12
  • I have another problem: using `urllib.urlencode({"palavra":urllib2.quote('Olá'),"lingua":"portugues-ingles"})`, I got `'palavra=Ol%25C3%25A1&lingua=portugues-ingles'`, but this `%25` breaks my URL... How can I fix it? =-/ – GarouDan Nov 16 '11 at 23:29
  • @GarouDan: `urlencode`, as the name suggests, does URL encoding, so you should not call `quote()` in this case: `urllib.urlencode(dict(palavra=u'Olá'.encode('utf-8'), lingua='portugues-ingles'))`. You should set your source encoding if you use non-ASCII characters in literal strings. – jfs Nov 17 '11 at 19:04
  • Very interesting, @J.F.Sebastian, thanks. It works. Do you (or anyone else) know how to transform `Olá mundo!` into `ol%E1%20mundo!`? I think with your answer and a solution to this other problem I can go ahead with my app ^^. Thanks. – GarouDan Nov 18 '11 at 21:41
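The two issues raised in these comments can be illustrated with a short Python 3 sketch. It assumes the other site expects Latin-1 (ISO-8859-1) bytes, which is a guess based on `%E1` being the Latin-1 byte for `á`. The lowercase `o` in `ol%E1%20mundo!` is not produced by percent encoding and would need a separate `.lower()` call if the site really requires it. The variable names are illustrative.

```python
from urllib.parse import quote, urlencode

# Latin-1 instead of UTF-8: "á" is the single byte 0xE1.
latin1_quoted = quote("Olá mundo!", safe="!", encoding="latin-1")

# urlencode already percent-encodes its values, so pass the raw text.
query = urlencode({"palavra": "Olá", "lingua": "portugues-ingles"})

# Quoting first and then calling urlencode encodes the "%" signs again
# as %25, which is the double encoding seen in the comment above.
double_encoded = urlencode({"palavra": quote("Olá")})

# urlencode accepts an encoding argument as well (assumed Latin-1 target).
latin1_query = urlencode({"palavra": "Olá"}, encoding="latin-1")

print(latin1_quoted)
print(query)
print(double_encoded)
print(latin1_query)
```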