I was going through the documentation for the string.translate function, which says:

Delete all characters from s that are in deletechars (if present), and then translate the characters using table, which must be a 256-character string giving the translation for each character value, indexed by its ordinal. If table is None, then only the character deletion step is performed.

  • What does table mean here? Can it be a dict containing the mapping?
  • What does "must be a 256-character string" mean?
  • Can the table be made manually or through a custom function instead of string.maketrans?

I tried the function (attempts below) just to see how it works, but couldn't get it to work.

>>> "abcabc".translate("abcabc",{ord("a"): "d", ord("c"): "x"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: translation table must be 256 characters long
>>> "abcabc".translate({ord("a"): ord("d"), ord("c"): ord("x")}, "b")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

>>> "abc".translate({"a": "d", "c": "x"}, ["b"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a character buffer object

What am I missing here?

Elias Zamaria
Bleeding Fingers

3 Answers

It depends on the Python version you are using.

In Python 2.x, the table is a 256-character string. It can be created using string.maketrans:

>>> import string
>>> tbl = string.maketrans('ac', 'dx')
>>> "abcabc".translate(tbl)
'dbxdbx'

In Python 3.x, the table is a mapping of Unicode ordinals to Unicode characters.

>>> "abcabc".translate({ord('a'): 'd', ord('c'): 'x'})
'dbxdbx'
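
As a rough Python 3 counterpart to the string.maketrans example above, str.maketrans can build the ordinal-keyed mapping for you. This is only a sketch; the variable names are illustrative, and the optional third argument (characters to delete) is shown because the question also tried to delete "b".

```python
# Python 3 only: str.maketrans returns a dict keyed by Unicode ordinals.
table = str.maketrans("ac", "dx")          # {97: 100, 99: 120}
result = "abcabc".translate(table)

# An optional third argument lists characters to delete;
# they are mapped to None in the resulting table.
table_del = str.maketrans("ac", "dx", "b")
result_del = "abcabc".translate(table_del)

print(result)      # dbxdbx
print(result_del)  # dxdx
```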
falsetru
    Python 2 `unicode.translate()` behaves exactly like `str.translate()` in Python 3. That's because you have way more than 256 possible values to translate. Inversely, `bytes.translate()` works exactly like Python 2 `str.translate()`. So it does not depend on the Python version, it depends on the object type; Unicode vs bytestring. – Martijn Pieters Jul 04 '17 at 21:41

table must be a string of 256 characters; the str.translate() method uses this table to map the byte value (a number between 0 and 255) to a new character; e.g. any character 'a' (a byte with the integer value 97) is replaced with the 98th character in the table.

You should refer to the documentation for the str.translate() method for all this, not the string.translate() function; the latter's documentation is not as complete.

You can build a table with the string.maketrans() function: pass it the characters you want to replace, followed by the characters that replace them. For example:

>>> import string
>>> table = string.maketrans('ac', 'cx')
>>> len(table)
256
>>> table[97]
'c'
>>> 'abcabc'.translate(table, 'b')
'cxcx'

The second argument (deletechars) must also be a string, not a list.

You appear to have read the documentation for the unicode.translate() method; the behaviour is different there, and you do indeed have to pass in a dictionary for unicode.translate(). Since the Python 2 unicode type is the str type in Python 3, that's also how you'd use str.translate() in Python 3 (where bytes.translate() matches the above behaviour).
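
To illustrate the 256-entry table without string.maketrans, here is a minimal sketch that builds one by hand. It assumes Python 3, so it uses bytes.translate() (which, as noted above, matches the Python 2 str.translate() behaviour); the variable names are illustrative only.

```python
# Start from the identity table: the byte with value i maps to itself.
manual = bytearray(range(256))

# Override only the entries we care about, indexed by ordinal.
manual[ord("a")] = ord("d")
manual[ord("c")] = ord("x")
table = bytes(manual)

# Translate only, then translate while deleting every b"b".
translated = b"abcabc".translate(table)
translated_del = b"abcabc".translate(table, b"b")

# The hand-built table is the same one bytes.maketrans produces.
same_as_maketrans = table == bytes.maketrans(b"ac", b"dx")

print(len(table), translated, translated_del, same_as_maketrans)
```

So the table can be made manually, as long as it ends up being exactly 256 entries long and is indexed by byte value.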

Martijn Pieters

To translate text using a {char: char} dictionary (e.g. {'a': 'X', 'J': 'y', ...}) rather than an {ordinal: char} dictionary:

text.translate({ord(k):dictionary[k] for k in dictionary})
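
A small self-contained sketch of the same idea, assuming Python 3. The helper name translate_chars and the sample mapping are hypothetical. In Python 3, str.maketrans also accepts a {char: char} dictionary directly, so both forms are shown side by side.

```python
def translate_chars(text, mapping):
    """Translate text using a {char: char} dict (hypothetical helper)."""
    # Convert single-character keys to their ordinals, as str.translate expects.
    return text.translate({ord(k): v for k, v in mapping.items()})


mapping = {"a": "X", "J": "y"}  # sample data for illustration

via_helper = translate_chars("aJb", mapping)
# str.maketrans does the same key conversion when given a single dict.
via_maketrans = "aJb".translate(str.maketrans(mapping))

print(via_helper, via_maketrans)  # Xyb Xyb
```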