Python 2 vs 3: consistent results with getting a byte from byte string

Question

Is there any simple way to get consistent results in both Python 2 and Python 3 for operations like "give me the N-th byte in a byte string"? Getting either byte-as-integer or byte-as-character will do for me, as long as the result is consistent.

I.e. given

s = b"123"

Naïve approach yields:

s[1] # => Python 2: '2', <type 'str'>
s[1] # => Python 3: 50, <class 'int'>

Wrapping that in ord(...) yields an error in Python 3:

ord(s[1]) # => Python 2: 50, <type 'int'> 
ord(s[1]) # => Python 3: TypeError: ord() expected string of length 1, but int found

I can think of a fairly complicated compat solution:

ord(s[1]) if (type(s[1]) == type("str")) else s[1] # 50 in both Python 2 and 3

... but maybe there's an easier way that I just haven't noticed?

Related [What does the 'b' character do in front of a string literal?](https://stackoverflow.com/questions/6269765/what-does-the-b-character-do-in-front-of-a-string-literal) — Guy, Oct 16 '19 at 10:40
You *are* aware that support for 2.x is about to be officially dropped, yes? — Karl Knechtel, Oct 16 '19 at 11:01
@KarlKnechtel Yes, and I'm totally aware of tons of our users potentially using Python 2, so unless there's something very dire, I'd like to keep our Python 2 support too. — GreyCat, Oct 16 '19 at 14:14
Ouch, sorry to hear about that. I wish you the best of luck in avoiding people yelling at you about it in the future. Maintenance is never fun IMX. — Karl Knechtel, Oct 17 '19 at 03:12

score 3 · Accepted Answer · answered Oct 16 '19 at 11:00

A length-1 slice will also be a byte sequence in both 2.x and 3.x:

s = b'123'
s[1:2] # 3.x: b'2'; 2.x: '2', which is the same thing but the repr() rules are different.
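A minimal sketch of how the slice could be wrapped for reuse. The helper names `byte_at` and `int_at` are hypothetical and not part of the original answer; the integer variant relies on `ord()` accepting a length-1 `str` in Python 2 and a length-1 `bytes` object in Python 3:

```python
def byte_at(s, n):
    """Return the n-th byte of s as a length-1 byte string (hypothetical helper)."""
    return s[n:n + 1]


def int_at(s, n):
    """Return the n-th byte of s as an integer (hypothetical helper)."""
    # ord() takes a length-1 str on Python 2 and a length-1 bytes on Python 3.
    return ord(s[n:n + 1])


s = b"123"
print(byte_at(s, 1) == b"2")  # True
print(int_at(s, 1))           # 50
```

One caveat with this sketch: unlike plain indexing, a slice does not raise IndexError for an out-of-range position and does not handle negative indices the same way. It returns an empty byte string instead, so `int_at` would fail with a TypeError from `ord()`.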

Karl Knechtel


Thanks, that seems to be the most elegant choice in my case — no bulky code and no extra libraries to require! – GreyCat Oct 16 '19 at 14:11

ShadowRanger · Answer 2 · 2019-10-16T23:13:15.490

If you use the bytearray type (converting if needed), behavior will be identical on both versions and will always match the Python 3 behavior of bytes. That's because bytearray is a distinct type on Python 2 (with Python 3 behavior), whereas bytes there is just an alias for str.
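As a rough illustration of the bytearray approach (the variable names are only for this sketch), indexing the converted object gives an integer on both versions. Note that the conversion copies the data, which may matter for large inputs:

```python
s = b"123"

# Convert once; indexing a bytearray yields an int on Python 2 and Python 3.
ba = bytearray(s)
second = ba[1]

print(second)                   # 50
print(isinstance(second, int))  # True
```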

The more typical solution would be to use the six compatibility library, which provides six.indexbytes, so on either version of Python, you could do:

>>> import six
>>> s = b"123"
>>> six.indexbytes(s, 1)
50

norok2 · Answer 3 · 2019-10-16T11:14:01.577

What about something like this?

import sys

if sys.version_info.major == 3:
    def index(s, n):
        return s[n]
elif sys.version_info.major == 2:
    def index(s, n):
        return ord(s[n])
else:
    raise NotImplementedError

edited Oct 16 '19 at 11:14

answered Oct 16 '19 at 10:58


For the record, this is *exactly* what [`six.indexbytes`](https://six.readthedocs.io/#six.indexbytes) does. – ShadowRanger Oct 16 '19 at 11:08
Definitely, `six` is a nice arrow in your quiver when writing Python 2/3 code. – norok2 Oct 16 '19 at 11:09
Looks like a recipe for bugs if Python ever goes to v4. – Holloway Oct 16 '19 at 11:12
@Holloway Fixed. But TBH I hardly think that it was relevant for the problem at hand. – norok2 Oct 16 '19 at 11:15
The reason for my question is exactly to *avoid* rolling such a lengthy construct myself every time I need it. Using `six.indexbytes` or, better yet, the length-1 slice approach suggested in https://stackoverflow.com/a/58411754/487064 is what I was looking for. – GreyCat Oct 16 '19 at 14:15
