I am looking through some unit testing code and I found this:

self.assertIn(b'Hello', res.body)

I know that this means bytes in Python 3, which returns a byte array, as I found here. I believe the code was written for Python 3.3, and I am trying to figure out how it works in other versions (in my case 2.7). The related question that I found had a poorly written accepted answer with contradictory comments that confused me.

Questions:

  • In what versions of Python does b'myString' "work"?
  • How does it behave in Python 2.x?
  • How does it behave in Python 3.x?
  • Does that have something to do with the byte literal change?
Stephan
  • What's contradictory about the comments to that answer? It's just the answerer and another user helping each other out looking for reference links to back up the very simple answer. (Section 2.4.1 of the reference docs had a bug in 2.6, but it was fixed; see [here](http://docs.python.org/2.7/reference/lexical_analysis.html#string-literals).) – abarnert Jul 17 '13 at 21:37
  • And what part of the answer is unclear? It specifically says it's a bytes literal in 3.x, it's not legal at all in 2.5 and older, and it's equivalent to a plain string in 2.6+ for future compatibility. The only thing remotely confusable is whether 2.6+ includes 3.x (it doesn't; that was already covered in the first part). – abarnert Jul 17 '13 at 21:38
  • Finally, `bytes` does _not_ return a `bytearray` in 3.x. Those are two separate but related types, similar to `frozenset` vs. `set` or `tuple` vs. `list`. – abarnert Jul 17 '13 at 21:40
  • The answer uses the word "it" a total of two times. One of them is inside a parenthetical aside that you can completely ignore. The other, I can't imagine what "it" could possibly refer to besides either "bytes literal" or "this prefix", and it makes perfect sense either way. – abarnert Jul 17 '13 at 22:12
  • More importantly, if you think an answer is confusing, comment on the answer; don't create a new question asking for comments on the answer to another question. – abarnert Jul 17 '13 at 22:12

1 Answer

This is all described in the document you linked.

  • In what versions of Python does b'myString' "work"? 2.6 and later, including every 3.x release.
  • How does it behave in Python 2.x? It creates a bytes literal, which is the exact same thing as a str literal in 2.x.
  • How does it behave in Python 3.x? It creates a bytes literal, which is not the same thing as a str literal in 3.x.
  • Does that have something to do with the byte literal change? Yes. That's the whole point: it lets you write "future compatible" code, that is, code that works in both 2.6+ and 3.0+ without 2to3.
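
A minimal sketch of those differences, written so the same file should run unchanged on 2.6+ and on 3.x. The names `PY2`, `data`, `same_type`, `equal_to_plain`, `first` and `found` are only illustrative, not anything from the code in the question.

```python
import sys

PY2 = sys.version_info[0] == 2  # True on 2.x, False on 3.x

data = b'Hello'  # valid syntax on 2.6+ and on every 3.x

# On 2.6/2.7, bytes is just another name for str; on 3.x they are separate types.
same_type = bytes is str

# A bytes literal equals the plain string literal only where the two types coincide.
equal_to_plain = (data == 'Hello')

# Indexing gives a one-character str on 2.x and an int on 3.x.
first = data[0]

# A membership test between two byte strings works on both lines.
found = b'Hello' in b'Hello, world'

print(same_type, equal_to_plain, repr(first), found)
```

So a check like `self.assertIn(b'Hello', res.body)` should behave the same on 2.7 and 3.x, assuming `res.body` is a byte string on both.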

Quoting from the first paragraph in the section you linked:

For future compatibility, Python 2.6 adds bytes as a synonym for the str type, and it also supports the b'' notation.

Note that, as it says a few lines down, Python 2.x bytes/str is not exactly the same type as Python 3.x bytes: "most notably, the constructor is completely different". But bytes literals are the same, except in the edge case where you're putting non-ASCII characters into a bytes literal (which has no defined meaning in 2.x, but does something arbitrary that may sometimes happen to be what you'd hoped, while in 3.x it's a guaranteed SyntaxError).
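
To make the constructor difference concrete, here is a small sketch; `PY2`, `from_int`, `literal` and `length` are illustrative names. The same call, `bytes(3)`, converts the number to text on 2.x (because `bytes` is `str` there) but allocates three zero bytes on 3.x, while an ASCII-only literal is the same on both.

```python
import sys

PY2 = sys.version_info[0] == 2  # True on 2.x, False on 3.x

# Same call, different meaning: 2.x behaves like str(3),
# 3.x builds a bytes object of length 3 filled with zero bytes.
from_int = bytes(3)

# The literal itself agrees across versions as long as it stays ASCII.
literal = b'abc'
length = len(literal)

print(repr(from_int), length)
```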

abarnert
  • Is there no `bytes` literal equivalent in Python 2.x? I'm working on making a script that was written for Python 3 compatible with Python 2.7. I ran `3to2` on it, but it still doesn't work, so I'm going in and doing it myself. When I read in a certain kind of file header I get values like `b'DIGLABELdigin\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00'` in Python 3, but they come up as weird symbols in Python 2.7. They look like little 2x2 matrices, but they're characters. When I paste them into a text editor they come up as new lines and things like that. – G Warner Mar 11 '15 at 19:17
  • In Python 2.7, bytes and str are the same type. You can use the 'b' prefix, but it has no effect. (Similarly, in Python 3.4, you can use the 'u' prefix, but it has no effect.) So you most likely already have the right thing; the issue is just that you're trying to print it to the console, and Python 2.x and 3.x do that differently, so you see, say, the cp1252 character for 0x01 in 2.x, but the escaped string `\x01` in 3.x. If you want to ensure that you see escaped strings in both, explicitly `print(repr(s))`. – abarnert Mar 12 '15 at 20:40
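
As a rough illustration of the printing point, `repr` is one way to get an escaped view on either version. The shortened `header` value below is made up for the example and is not the real file header.

```python
# Hypothetical, shortened header value used only for illustration.
header = b'DIGLABELdigin\x00\x00\x01'

# repr() escapes the non-printable bytes on both 2.x and 3.x;
# only the leading b prefix differs between the two.
shown = repr(header)
print(shown)
```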