
I want to represent all characters in a string as in this table.

But when I do

raw = 'æøå'
encoded = raw.encode('cp1252')
print(encoded)

I get

>>> b'\xe6\xf8\xe5'

What I want is

>>> %E6%F8%E5

as a string for use in a URL.

sinB
  • There's no such thing. 1252 is the Latin codepage. *URLs* though have their *own* encoding, unrelated to codepages. You are asking how to URL-encode that string. – Panagiotis Kanavos Nov 12 '18 at 11:13
  • @PanagiotisKanavos: Latin-1 is a different standard. CP-1252 differs from that standard, don't equate the two. You are completely right about this not being CP1252 encoded output, of course. – Martijn Pieters Nov 12 '18 at 11:18
  • @MartijnPieters yes, I know but I'm tired of writing an entire article to describe encodings in comments for the Nth time. The OP is still asking the wrong thing, confusing character codepages for URL encoding – Panagiotis Kanavos Nov 12 '18 at 11:18
  • @PanagiotisKanavos: absolutely. And `urllib.parse.quote()` takes care of encoding for you. – Martijn Pieters Nov 12 '18 at 11:20

1 Answer


You have to "quote" (percent-encode) your string with `urllib.parse.quote()`, passing the encoding you want:

import urllib.parse

raw = 'æøå'
print(urllib.parse.quote(raw, encoding='cp1252'))
# prints %E6%F8%E5
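
As a small sketch of why the `encoding` argument matters here: `quote()` defaults to UTF-8, so without it the same string produces a different percent-encoded result. The variable names below are only illustrative.

```python
from urllib.parse import quote, unquote

raw = 'æøå'

# Default encoding is UTF-8: each of these characters becomes two bytes.
utf8_quoted = quote(raw)
# In cp1252 each of these characters is a single byte.
cp1252_quoted = quote(raw, encoding='cp1252')

print(utf8_quoted)    # %C3%A6%C3%B8%C3%A5
print(cp1252_quoted)  # %E6%F8%E5

# Decoding must name the same encoding to get the original string back.
round_trip = unquote(cp1252_quoted, encoding='cp1252')
print(round_trip == raw)  # True
```

Whichever encoding is used, the server receiving the URL has to expect the same one.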
Antwane
  • You do not need to encode separately. `urllib.parse.quote()` takes an encoding argument directly. – Martijn Pieters Nov 12 '18 at 11:20
  • Use `urllib.parse.quote(raw, encoding='cp1252')`, skip the `raw.encode()` call altogether. – Martijn Pieters Nov 12 '18 at 11:21
  • @MartijnPieters Thanks for the information, I just discovered this. Answer updated – Antwane Nov 12 '18 at 11:22
  • Sometimes you do need to handle `bytes`, at which point I'd recommend you use [`urllib.parse.quote_from_bytes()`](https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_from_bytes), to be explicit about what is being done. Also, the OP probably wants to use `quote_plus()`, not `quote()`, as the vast majority of these use cases want the [`application/x-www-form-urlencoded` content type variant](https://en.wikipedia.org/wiki/Percent-encoding#The_application/x-www-form-urlencoded_type). – Martijn Pieters Nov 12 '18 at 11:25
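
The two functions mentioned in the last comment can be sketched as follows. This is only an illustration: the sample value `'æ ø'` is made up to show how a space is treated, and is not from the question.

```python
from urllib.parse import quote, quote_from_bytes, quote_plus

# Bytes that were already encoded elsewhere, e.g. via raw.encode('cp1252').
encoded = 'æøå'.encode('cp1252')
from_bytes = quote_from_bytes(encoded)
print(from_bytes)  # %E6%F8%E5

# quote() escapes a space as %20; quote_plus() uses '+',
# which is the form-data (application/x-www-form-urlencoded) convention.
with_quote = quote('æ ø', encoding='cp1252')
with_plus = quote_plus('æ ø', encoding='cp1252')
print(with_quote)  # %E6%20%F8
print(with_plus)   # %E6+%F8
```

Which variant is right depends on where the value ends up: `quote()` suits path segments, while `quote_plus()` suits query-string or form values.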