3

Is there any neat trick to slice a binary number into groups of five digits in python?

'00010100011011101101110100010111' => ['00010', '10001', '10111', ... ]

Edit: I want to write a cipher/encoder in order to generate "easy to read over the phone" tokens. The standard base32 encoding has the following disadvantages:

  • Potential to generate accidental f*words
  • Uses confusing chars like 'I', 'L', 'O' (may be confused with 0 and 1)
  • Easy to guess sequences ("AAAA", "AAAB", ...)

I was able to roll my own in 20 lines of Python; thanks, everybody. My encoder leaves off 'I', 'L', 'O' and 'U', and the resulting sequences are hard to guess.
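The asker's 20-line encoder is not shown, so the following is only a minimal sketch of the idea, not their code. It assumes a hypothetical 32-symbol alphabet (digits plus uppercase letters without 'I', 'L', 'O', 'U') and a hypothetical helper name `encode_bits`; it does nothing to make sequences hard to guess.

```python
# Hypothetical alphabet: 10 digits + 22 letters (no I, L, O, U) = 32 symbols.
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_bits(bits, group=5):
    """Map each 5-bit group of a binary string to one symbol of ALPHABET."""
    # Left-pad with zeros so the length is a multiple of the group size.
    padded_len = -(-len(bits) // group) * group
    bits = bits.zfill(padded_len)
    # Slice into fixed-size groups and look each one up in the alphabet.
    chunks = [bits[i:i + group] for i in range(0, len(bits), group)]
    return "".join(ALPHABET[int(chunk, 2)] for chunk in chunks)

token = encode_bits('00010100011011101101110100010111')
```

Left-padding keeps the numeric value of the bit string unchanged, so no partial group is left over at the end.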

Paulo Scardine

7 Answers

6
>>> a='00010100011011101101110100010111'
>>> [a[i:i+5] for i in range(0, len(a), 5)]
['00010', '10001', '10111', '01101', '11010', '00101', '11']
Greg Hewgill
6
>>> s = '00010100011011101101110100010111'
>>> [''.join(each) for each in zip(*[iter(s)]*5)]
['00010', '10001', '10111', '01101', '11010', '00101']

or:

>>> list(map(''.join, zip(*[iter(s)]*5)))
['00010', '10001', '10111', '01101', '11010', '00101']
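For readers unfamiliar with the idiom, here is an expanded sketch of what `zip(*[iter(s)]*5)` does. The names `it` and `columns` are only illustrative.

```python
s = '00010100011011101101110100010111'

it = iter(s)          # a single iterator over the characters
columns = [it] * 5    # five references to that same iterator

# zip takes one item from each "column" in turn; since all five are the
# same iterator, every tuple consumes five consecutive characters.
# zip stops at the first incomplete tuple, so trailing bits are dropped.
groups = [''.join(chunk) for chunk in zip(*columns)]
```

This is why the two trailing bits ('11') do not appear in the result.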

[EDIT]

Greg Hewgill raised the question of what to do with the two trailing bits. Here are some possibilities:

>>> from itertools import zip_longest  # named izip_longest on Python 2
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue='')))
['00010', '10001', '10111', '01101', '11010', '00101', '11']
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue=' ')))
['00010', '10001', '10111', '01101', '11010', '00101', '11   ']
>>>
>>> list(map(''.join, zip_longest(*[iter(s)]*5, fillvalue='0')))
['00010', '10001', '10111', '01101', '11010', '00101', '11000']
pillmuncher
  • I will just pad it first to the length in bits of the maximum value (which is a multiple of 5, so there will never be any trailing bits). – Paulo Scardine Aug 09 '13 at 05:48
1

My question was marked as a duplicate of this one, so I will answer it here.

Here is a more general and memory-efficient answer for this kind of question, using generators:

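from itertools import islice

def slice_generator(an_iter, num):
    """Yield successive tuples of up to `num` items from `an_iter`."""
    an_iter = iter(an_iter)
    while True:
        # Take the next `num` items; the last tuple may be shorter.
        result = tuple(islice(an_iter, num))
        if not result:
            return  # input exhausted
        yield result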

So for this question, we can do:

>>> l = '00010100011011101101110100010111'
>>> [''.join(x) for x in slice_generator(l,5)]
['00010', '10001', '10111', '01101', '11010', '00101', '11']
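Because the generator only pulls `num` items at a time, it also works on inputs that cannot be held in memory. A small sketch using an infinite counter (the names `chunks` and `first_two` are only illustrative):

```python
from itertools import count, islice

def slice_generator(an_iter, num):
    """Yield successive tuples of up to `num` items from `an_iter`."""
    an_iter = iter(an_iter)
    while True:
        result = tuple(islice(an_iter, num))
        if not result:
            return
        yield result

chunks = slice_generator(count(), 3)   # count() never ends
first_two = list(islice(chunks, 2))    # only six items are ever consumed
```

A slicing approach such as `a[i:i+5]` would not work here, since `count()` has no length and cannot be indexed.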
Xiao
1

Another way to group iterables, from the itertools examples:

from itertools import zip_longest  # named izip_longest on Python 2

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
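Applied to the string from the question, the recipe could be used as sketched below. This assumes Python 3 (where the function is called `zip_longest`) and pads the last group with '0'; the variable names `bits` and `groups` are only illustrative.

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

bits = '00010100011011101101110100010111'
# Each group is a tuple of characters, so join it back into a string.
groups = [''.join(group) for group in grouper(5, bits, fillvalue='0')]
```

Note that the fill value must be a string here; with the default `None`, `''.join` would raise a `TypeError` on the last group.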
David Wolever
1

Per your comments, you actually want base 32 strings.

>>> import base64
>>> base64.b32encode(b"good stuff")  # Python 3 requires bytes
b'M5XW6ZBAON2HKZTG'
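If the standard alphabet is the only objection, one possible approach is to translate the output of `b32encode` into a different alphabet. This is a sketch under assumptions: the `CUSTOM` alphabet (no 'I', 'L', 'O', 'U') and the helper name `b32_custom` are hypothetical, and dropping the '=' padding is a choice made here for readability.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"  # RFC 4648 base32 alphabet
CUSTOM = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"    # hypothetical replacement
TABLE = str.maketrans(STANDARD, CUSTOM)

def b32_custom(data):
    """Base32-encode bytes, then map each symbol to the custom alphabet."""
    encoded = base64.b32encode(data).decode("ascii")
    # Strip the '=' padding before translating; it is not part of CUSTOM.
    return encoded.rstrip("=").translate(TABLE)

token = b32_custom(b"good stuff")
```

Decoding would need the inverse table and the padding restored before calling `base64.b32decode`.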
Allen
1

How about using a regular expression?

>>> import re
>>> re.findall('.{1,5}', '00010100011011101101110100010111')
['00010', '10001', '10111', '01101', '11010', '00101', '11']

This will break, though, if your input string contains newlines that you want included in the groups, because `.` does not match a newline by default.
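A short sketch of the newline behaviour, and of the `re.DOTALL` flag as one way around it (the sample string `text` is made up for illustration):

```python
import re

text = 'ab\ncdefgh'

# By default '.' does not match '\n', so the newline is skipped and
# the grouping restarts after it.
default_groups = re.findall('.{1,5}', text)

# With re.DOTALL, '.' matches any character, including '\n'.
dotall_groups = re.findall('.{1,5}', text, flags=re.DOTALL)
```

Here `default_groups` is `['ab', 'cdefg', 'h']`, while `dotall_groups` is `['ab\ncd', 'efgh']`.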

Peter Gibson
0
>>> l = '00010100011011101101110100010111'
>>> def splitSize(s, size):
...     return [''.join(x) for x in zip(*[list(s[t::size]) for t in range(size)])]
...  
>>> splitSize(l, 5)
['00010', '10001', '10111', '01101', '11010', '00101']
pyfunc