74

I have a string I would like to split into N equal parts.

For example, imagine I have a string of length 128 and I want to split it into 4 chunks of length 32 each; i.e., the first 32 chars, then the second 32, and so on.

How can I do this?

Air
Mo.
  • Related: http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python – poke Mar 21 '14 at 23:58

6 Answers

94
import textwrap
print(textwrap.wrap("123456789", 2))
#prints ['12', '34', '56', '78', '9']

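Note: be careful with whitespace. textwrap.wrap breaks lines at whitespace and drops whitespace at the edges of each line, so for strings that contain spaces this may or may not be what you want.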
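To make the whitespace caveat concrete, here is a small comparison between textwrap.wrap and plain slicing. The sample string is my own, chosen only for illustration:

```python
import textwrap

text = "ab cd ef"  # sample input, chosen for illustration

# textwrap.wrap breaks at whitespace and drops it at line edges
wrapped = textwrap.wrap(text, 3)

# plain slicing keeps every character, spaces included
sliced = [text[i:i + 3] for i in range(0, len(text), 3)]

print(wrapped)  # ['ab', 'cd', 'ef']
print(sliced)   # ['ab ', 'cd ', 'ef']
```

If the string has no whitespace at all, the two approaches should give the same chunks.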

"""Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
    """
Rusty Rob
71

You can use a list comprehension that steps through the string n characters at a time, where n is the chunk length:

parts = [your_string[i:i+n] for i in range(0, len(your_string), n)]
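Since the question asks for a number of parts rather than a chunk length, here is a minimal sketch that derives n first. The helper name split_into_parts and its num_parts parameter are my own, not from any library:

```python
def split_into_parts(your_string, num_parts):
    """Hypothetical helper: split into at most num_parts chunks."""
    if not your_string:
        return []
    # Ceiling division, so the last chunk holds any leftover characters
    n = -(-len(your_string) // num_parts)
    return [your_string[i:i + n] for i in range(0, len(your_string), n)]


s = "a" * 32 + "b" * 32 + "c" * 32 + "d" * 32
parts = split_into_parts(s, 4)
print([len(p) for p in parts])  # [32, 32, 32, 32]
```

When the length is not divisible by num_parts, the last chunk is shorter, and for short strings you may get fewer than num_parts chunks.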
Tim Zimmermann
35

Another common way of grouping elements into n-length groups:

>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))
['12', '34', '56', '78', '90']

This method comes straight from the docs for zip().
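One caveat: zip() stops at the shortest input, so a trailing partial group is silently dropped. If you want to keep it, a possible variant uses itertools.zip_longest with an empty-string fill value. The function name grouped is my own:

```python
from itertools import zip_longest


def grouped(s, n):
    """Sketch: group s into n-length strings, keeping a shorter tail."""
    args = [iter(s)] * n  # the same iterator repeated n times
    # Missing positions in the last group are filled with '' and vanish on join
    return [''.join(group) for group in zip_longest(*args, fillvalue='')]


print(grouped('123456789', 2))  # ['12', '34', '56', '78', '9']
```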

anon582847382
  • 3
    This is much better than mine! Mind if I steal this for future use? :) – Adam Smith Mar 21 '14 at 23:55
  • 5
    Note that _"The returned list is truncated in length to the length of the shortest argument sequence"_ so if the data series cannot fit into equal groups of n-length any extra will be left off. – None Jun 21 '15 at 17:46
  • That `list` call looks redundant; why did you add it? – Air Sep 10 '15 at 20:29
  • 1
    @Air The 'map' function returns a list in Python 2 and an iterator (a map object) in Python 3, which would still work. However, I thought a list would make for a better output for the purposes of this answer. – anon582847382 Sep 11 '15 at 21:23
  • Perfect!, converting from `ip6.arpa.` string using `':'.join(map(''.join, zip(*[reversed(qn.split('.'))]*4)))` – NiKiZe Dec 18 '21 at 06:59
7

Recursive way:

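def split_str(seq, chunk, skip_tail=False):
    lst = []
    if chunk <= len(seq):
        # Take one full chunk, then recurse on the remainder.
        lst.append(seq[:chunk])
        lst.extend(split_str(seq[chunk:], chunk, skip_tail))
    elif not skip_tail and seq:
        # Keep the shorter leftover piece unless skip_tail is set.
        lst.append(seq)
    return lst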

Demo:

seq = "123456789abcdefghij"

print(split_str(seq, 3))
print(split_str(seq, 3, skip_tail=True))

# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
# ['123', '456', '789', 'abc', 'def', 'ghi']
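Because each chunk costs one recursive call, long inputs can hit Python's recursion limit. Here is a possible iterative sketch with the same skip_tail behaviour. The name split_str_iter is my own:

```python
def split_str_iter(seq, chunk, skip_tail=False):
    """Sketch: iterative equivalent of split_str, no recursion involved."""
    parts = [seq[i:i + chunk] for i in range(0, len(seq), chunk)]
    # Only the last piece can be shorter than chunk; drop it if requested
    if skip_tail and parts and len(parts[-1]) < chunk:
        parts.pop()
    return parts


seq = "123456789abcdefghij"
print(split_str_iter(seq, 3))
print(split_str_iter(seq, 3, skip_tail=True))

# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
# ['123', '456', '789', 'abc', 'def', 'ghi']
```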
Omid Raha
  • What is the max length of the sequence? I'm receiving: "RecursionError: maximum recursion depth exceeded while calling a Python object" with ~25k chars. – hi im vinzent Dec 18 '18 at 21:28
6

You can treat a string similarly to a list in many cases. There are lots of answers here: Splitting a list into N parts of approximately equal length

For example, you could work out chunk_size = len(my_string) // N (integer division, because in Python 3 a plain / returns a float, which cannot be used as a slice index).

Then, to access a chunk, use my_string[i:i + chunk_size] and increment i by chunk_size, either in a for loop or in a list comprehension.
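As a rough sketch of that idea, assuming you want exactly N parts even when the length is not divisible by N, the remainder can be spread over the first few chunks. The function name split_evenly is my own:

```python
def split_evenly(my_string, N):
    """Sketch: exactly N parts whose lengths differ by at most one."""
    chunk_size, remainder = divmod(len(my_string), N)
    parts = []
    i = 0
    for k in range(N):
        # The first `remainder` chunks each take one extra character
        size = chunk_size + (1 if k < remainder else 0)
        parts.append(my_string[i:i + size])
        i += size
    return parts


print(split_evenly("123456789a", 4))  # ['123', '456', '78', '9a']
```

If the string is shorter than N, some of the parts come back as empty strings.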

Community
Rusty Rob
6

I like iterators!

def chunk(in_string, num_chunks):
    chunk_size = len(in_string) // num_chunks
    if len(in_string) % num_chunks:
        chunk_size += 1  # round up so every character is covered
    iterator = iter(in_string)
    for _ in range(num_chunks):
        accumulator = []
        for _ in range(chunk_size):
            try:
                accumulator.append(next(iterator))
            except StopIteration:
                break
        yield ''.join(accumulator)

## DEMO
>>> string = "a"*32+"b"*32+"c"*32+"d"*32
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccccccccc', 'dddddddddddddddddddddddddddddddd']
>>> string += "e" # so it's not evenly divisible
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcc', 'ccccccccccccccccccccccccccccccddd', 'ddddddddddddddddddddddddddddde']

In the timing below it is also faster than textwrap.wrap, although almost certainly less robust:

>>> import timeit, textwrap
>>> timeit.timeit(lambda: list(chunk(string,4)),number=500)
0.047726927170444355
>>> timeit.timeit(lambda: textwrap.wrap(string,len(string)//4),number=500)
0.20812756575945457

It is also pretty easy to hack to work with any iterable that supports len() (just drop the str.join and yield accumulator unless isinstance(in_string, str))

# after a petty hack
>>> list(chunk([1,2,3,4,5,6,7,8,9,10,11,12],4))
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
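For reference, one way that hack could look. This is only a sketch: the name chunk_any is my own, and I swapped the inner loop for itertools.islice:

```python
from itertools import islice


def chunk_any(items, num_chunks):
    """Sketch: chunk any iterable that supports len()."""
    chunk_size, remainder = divmod(len(items), num_chunks)
    if remainder:
        chunk_size += 1  # round up so every element is covered
    iterator = iter(items)
    is_str = isinstance(items, str)
    for _ in range(num_chunks):
        # islice takes up to chunk_size elements and stops early if exhausted
        group = list(islice(iterator, chunk_size))
        yield ''.join(group) if is_str else group


print(list(chunk_any([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 4)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
```

Strings still come back as strings; everything else comes back as lists.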
Adam Smith