Why does Python convert my bytes object into ints when I put them in a new list?

Question

Intro

Hi, I'm doing some fun cryptanalysis exercises from cryptopals, and I have run into an 'issue' that has also come up before and that I really don't understand.

Currently, I have a ciphertext that I read from a file in the following way:

from base64 import b64decode

with open("6.txt", 'r') as infile:
    b64_encoded = infile.readlines()

ciphertext = b64decode('\n'.join([x.strip() for x in b64_encoded if x != ""]))

It's now a bytes object, and looks like this when printed (this is just an excerpt):

b'\x1dB\x1fM\x0b\x0f\x02\x1fO\x13N<\x1aie\x1fI\x1c\x0eN\x13\x01\x0b\x07N\x1b\x01\x16E6\x00\x1e\x01Id T\x1d\x1dC3SNeR\x06\x00GT\x1c\rEM\x07\x04\x0cS\x12<\x0c\x1e\x08I\x1a\t\x11O\x14L!\x1aG+\x00\x05\x1dGY\x11\x04\t\x00d&\x07S\x007\x16\x06\x0c\x1a\x17A\x1d\x01RT0_\x00 \x13\n\x05GO\x12H\x08ENe>\x16\t8E\x06\x05\x08\x1aF\x07O\x1fYx~jb6\x0c\x1d\x0fA\rH\x06U\x1a\x1b\x00\x1dBt\x04\x1e\x01I\x1a\t\x11\x02Rz\x7fI\x00H:\x00\x1a\x13I\x1aOEH\x0f\x1d\rS\x04:\x01R\x19\x01\x0bA\x13\x06\x00L1_Sb\x15\x06\x07\t\x07T\x0b\x17A\x14\x16Iy35\x0b\x1b\x01\x05\x0fF\x07O\x1dNxNH\'R\x04\x07\x0cEXH\x08A\x00O T\x08t\x0b\x1d\x19I\x02\x00\x0e\x16\\\x00R0ie\x1fI\x02\x02T\x00\x01\x0b\x07N\x02\x10S\x01&\x10\x15M\x02\x07\x02\x1fO\x1bNx0i6R\n\x01\tT\x06\x07\tSN\x02\x10S\x08;\x10\x06\x05I\x0f\x0f\x10O;\x00:_G+\x1cId3OT\x02\ (...)

Context

I have gotten to a point in the exercise where I need to transpose this ciphertext according to a known key length k, so that I get a collection of strings where string n contains all the ciphertext characters that would have been encrypted with the n'th character of the key.

That means that if I call my function transpose(ciphertext, keyLen) with arguments transpose("123456789", 3), then my output would be:

[['1', '4', '7'], ['2', '5', '8'], ['3', '6', '9']]

Problem

Ok, the transpose function I have made looks like this:

def transpose(string, n):
    buckets = [[] for i in range(n)]
    i = 0
    for c in string:
        buckets[i].append(c)
        i += 1
        if i > (n-1):
            i = 0
    return buckets

When I use it on a string, it works just as expected and produces the expected output for "123456789". But when I pass my bytes ciphertext, the output looks like this:

[[29, 54, 60, 55, 56, 116, 58, 53, 116, 38, 59, 116, 94, 58, 55, 49, 57, 57, 59, 61, 53, 34, 53, 58, 59, 116, 36, 32, 101, 48, 60, 51, 58, 116, 53, 58, 115, 116, 116, 49, 33, 13, 116, 60, 54, 59, 122, 59, 44, 53, 53, 116, 49, 50, 116, 45, 51, 49, 45, 59, 53, 38, 116, 94, 60, 48, 94, 59, 116, 49, 116, 54, 55, 116, 116, 58, 115, 33, 49, 59, 53, 58, 32, 49, 38, 53, 53, 45, 59, 116, 45, 61, 59, 94, 54, 32, 48, 55, 120, 39], [66, 0, 12, 22, 69, 4, 1, 11, 11, 16, 16, 12, 40, 66, 4, 69, 11, 0, 9, 11, 22, 0, 11, 66, 11, 10, 9, 69, 72, 69, 28, 12, 69, 111, 14, 69, 22, 13, 17, 23, 17, 10, 11, 69, 9, 11, 69, 8, 12, 69, 11, 44, 69, 23, 34, 12, 13, 23, 69, 69, 23, 10, 8, 54, 17, 69, 60, 18, 12, 4, 60, 10, 4, 4, 9, 69, 69, 23, 73, 8, 23, 28, 69, 69, 28, 28, 28, 69, 69, 111, 69, 0, 8, 53, 10, 13, 0, 73, 69, 12], [31, 30, 30, 6, 6, 30, 82, 27, 29, 21, 6, 6, 11, 94, 7, 120, 94, 82, 82, 85, 23, 82, 82, 82, 23, 20, 19, 7, 64, 120, 31, 30, 23, 59, 23, 95, 82, 19, 29, 94, 82, 7, 29, 31, 11, 83, 36, 27, 17, 5, 22, 17, 19, 23, 30, 28, 82, 23, 6, 26, 6, 7, 27, 2, 82, 31, 29, 28, 28, 0, 61, 22, 28, 28, 11, 17, 19, 23, 82, 23, 23, 6, 6, 22, 16, 82, 94, 21, 5, 62, 6, 92, 23, 30, 11, 19, 0, 82, 49, 17], [77, 1, 8, 12, 5, 1, 25, 1, 25, 77, 5, 77, 77, 77, 30, 44, 77, 103, 25, 77, 77, 0, 9, 29, 77, 11, 20, 29, 64, 43 (...)

And now my bytes have been converted to their integer representations? This doesn't really make sense to me, since all I am doing is iterating over the bytes and placing them in buckets. Why are these bytes turned into integers if all you do is iterate over them?

``bytes`` *are* sequences of 1 byte *integers*, not of "individual bytes". — MisterMiyagi, Aug 26 '21 at 14:11
[PEP 358](https://www.python.org/dev/peps/pep-0358/) and [PEP 3137](https://www.python.org/dev/peps/pep-3137/) both say *that* ``bytes`` are sequences of integers, but do not give a rationale for it. — MisterMiyagi, Aug 26 '21 at 14:15

score 0 · Accepted Answer · answered Aug 26 '21 at 14:18

Ah, I remember this problem from cryptopals; I facepalmed when I understood how it works.

As MisterMiyagi said, bytes is not a sequence of "bytes" but a sequence of ints.

If you index into a str, you get another str:

>>> type("abc"[0])
<class 'str'>

But with bytes:

>>> type(b"abc"[0])
<class 'int'>
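Slicing behaves differently from indexing: a slice of a bytes object is still a bytes object, even when it has length 1. A small check using only built-ins (the variable names are just for illustration):

```python
data = b"abc"

# Indexing a bytes object gives an int in range(256).
first = data[0]

# Slicing gives a bytes object, even when the slice has length 1.
first_as_bytes = data[0:1]

# Iterating behaves like indexing: each item is an int.
items = [item for item in data]

print(first, first_as_bytes, items)  # 97 b'a' [97, 98, 99]
```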

So it's your for loop that 'converts' them to int: iterating over a bytes object yields ints, just as indexing does. list does the same conversion directly:

>>> list(b"abc")
[97, 98, 99]

But the reverse is also easily possible:

>>> bytes([97, 98, 99])
b'abc'
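If you would rather keep each bucket as bytes, one possible approach is extended slicing, which preserves the input type. The sketch below is not the function from the question: `transpose_sliced` is a hypothetical name, and it returns one str or bytes object per bucket instead of a list of single characters.

```python
def transpose_sliced(data, n):
    """Split data into n buckets; bucket k holds items k, k+n, k+2n, ..."""
    # Extended slicing keeps the input type: str in, str out; bytes in, bytes out.
    return [data[k::n] for k in range(n)]


print(transpose_sliced("123456789", 3))   # ['147', '258', '369']
print(transpose_sliced(b"123456789", 3))  # [b'147', b'258', b'369']

# A bucket of ints, as produced by the loop in the question,
# can also be turned back into bytes afterwards.
print(bytes([49, 52, 55]))  # b'147'
```

For the later XOR steps of the exercise, the int buckets are often convenient as they are, since `^` works on ints but not on bytes objects.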
