How to convert 'binary string' to normal string in Python3?

Question

For example, I have a string like this (the return value of subprocess.check_output):

>>> b'a string'
b'a string'

Whatever I do to it, it is always printed with the annoying b' before the string:

>>> print(b'a string')
b'a string'
>>> print(str(b'a string'))
b'a string'

Does anyone have any ideas about how to use it as a normal string or convert it into a normal string?

@HanfeiSun what you call a "*binary string*" is a **bytes object** (see [information about *bytes object* in the standard library](https://docs.python.org/3/library/stdtypes.html#typebytes) ) — loved.by.Jesus, Jan 06 '20 at 13:47

score 606 · Accepted Answer · answered Jul 12 '13 at 12:55

Decode it.

>>> b'a string'.decode('ascii')
'a string'

To get bytes from string, encode it.

>>> 'a string'.encode('ascii')
b'a string'
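
Since the question mentions subprocess.check_output, here is a minimal sketch of the same decoding step applied to its return value. The child command (a Python one-liner started through sys.executable) is only an assumption for illustration, and `text=True` needs Python 3.7 or later; it decodes with the locale's default encoding rather than the 'ascii' used above.

```python
import subprocess
import sys

# Hypothetical child process: a Python one-liner, so no external command is needed.
cmd = [sys.executable, "-c", "print('a string')"]

# Without extra arguments, check_output returns a bytes object.
raw = subprocess.check_output(cmd)

# Decode explicitly, then strip the trailing newline written by print().
decoded = raw.decode("ascii").strip()

# Alternatively, let subprocess decode the output (Python 3.7+).
as_text = subprocess.check_output(cmd, text=True).strip()

print(type(raw).__name__, type(decoded).__name__)
print(decoded == as_text)
```

If the encoding of the child's output is known, passing `encoding="utf-8"` (or whichever applies) to check_output is usually safer than relying on the locale default.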

answered Jul 12 '13 at 12:55

falsetru

@lyomi, I used `ascii` because the given string was made with ascii letters. You don't need to specify encoding if the encoding is `utf-8` (default in Python 3.x according to `str.encode`, `bytes.decode` doc-string) – falsetru Mar 30 '16 at 08:28
@lyomi In 2016 (and it's nearly the end) people still use ascii. There are many, many 'legacy' products and systems (including specifications), but there are also lots of reasons why you might be creating a 'binary string' where you don't want unicode or something to try and 'merge' multiple bytes into a single character. We often use 'strings' to contain binary data, for instance when making DNS requests. – Jmons Sep 23 '16 at 11:55
I suggest adding the following to complete the answer. Most of the time we need to decode bytes coming from the operating system, such as console output. The most Pythonic way I have found to do this is to `import locale` and then `os_encoding = locale.getpreferredencoding()`. This way, we can decode using `my_b_string.decode(os_encoding)` – aturegano Jul 27 '17 at 15:10
@aturegano, It's not the only option. `sys.getfilesystemencoding()`, `sys.stdin.encoding`, `sys.stdout.encoding`. IMHO, using those automatic encoding detection could solve problem because the sub-program (OP is using subprocess) could be written other way to determine encoding (or even hard-coded). Thanks for feedback, anyway. – falsetru Jul 28 '17 at 09:02
@falsetru Note that `sys.getfilesystemencoding()` returns the name of the encoding used to convert between Unicode filenames and bytes filenames and is strongly dependent on the operating system you are using. AFAIK, this function is used to convert to the system’s preferred representation. That means that it will not infer the encoding used by the console, which can be obtained using the aforementioned `locale.getpreferredencoding()` function – aturegano Jul 28 '17 at 11:42
@aturegano, subprocess.check_output can return an arbitrary byte string; it could be a file system path. You know my point, right? – falsetru Jul 28 '17 at 13:24
@lyomi long live ascii – micah May 14 '18 at 16:45
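
A rough sketch of the approach discussed in the comments above, assuming the bytes really were produced in the locale's preferred encoding (which, as the comments point out, a subprocess is not obliged to use). The `raw` value is a stand-in built for illustration, not real console output.

```python
import locale
import sys

# Encoding the current locale prefers for text data.
os_encoding = locale.getpreferredencoding(False)

# Stand-in for bytes received from the operating system (assumption).
raw = "a string".encode(os_encoding)

# errors="replace" substitutes undecodable bytes instead of raising.
text = raw.decode(os_encoding, errors="replace")

print("locale encoding:    ", os_encoding)
print("filesystem encoding:", sys.getfilesystemencoding())
print(repr(text))
```

The two encodings printed here can differ on some systems, which is the distinction the comments are drawing.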

score 126 · Answer 2 · edited May 23 '17 at 12:26

If the answer from falsetru didn't work, you could also try:

>>> b'a string'.decode('utf-8')
'a string'
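
A short sketch of when the 'ascii' decode fails and 'utf-8' helps. The sample text ("café", written with an escape so the snippet does not depend on the source file's encoding) is just an illustrative assumption.

```python
# "café" encoded as UTF-8; the é becomes the two bytes \xc3\xa9.
raw = "caf\u00e9".encode("utf-8")

# ASCII only covers bytes 0-127, so decoding these bytes raises.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# UTF-8 can decode them.
text = raw.decode("utf-8")

# If some data loss is acceptable, errors="replace" swaps each bad byte for U+FFFD.
lossy = raw.decode("ascii", errors="replace")

print(ascii_ok, text, ascii(lossy))
```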

edited May 23 '17 at 12:26

Community

answered Mar 11 '16 at 19:30

kame

score 11 · Answer 3 · answered Jun 02 '20 at 17:52

Please see the official encode() and decode() documentation from the codecs module. utf-8 is the default encoding for these functions, but there are several other standard encodings in Python 3, such as latin_1 or utf_32.
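
To illustrate why the choice of encoding matters, here is a small sketch that decodes the same two bytes with the default and with latin_1, and encodes a character with utf_32. The sample bytes are chosen for illustration only.

```python
raw = b"\xc3\xa9"

# With no argument, decode() uses utf-8: the two bytes form one character.
default = raw.decode()

# latin_1 maps every byte to one character, so the result is two characters.
latin = raw.decode("latin_1")

# utf_32 uses 4 bytes per character and prepends a 4-byte byte order mark.
utf32 = "a".encode("utf_32")

print(ascii(default), ascii(latin), len(utf32))
```

Decoding with the wrong codec often does not raise an error; it just produces different text, so it helps to know which encoding the producer of the bytes used.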

answered Jun 02 '20 at 17:52

Daniel Argüelles
