I have a string I would like to split into N equal parts.
For example, imagine I have a string of length 128 and I want to split it into 4 chunks of length 32 each; i.e., the first 32 chars, then the second 32, and so on.
How can I do this?
import textwrap
print(textwrap.wrap("123456789", 2))
#prints ['12', '34', '56', '78', '9']
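Note: be careful with whitespace. textwrap.wrap is meant for wrapping prose, so it breaks lines at word boundaries and drops whitespace at line edges; this may or may not be what you want. From its docstring:
"""Wrap a single paragraph of text, returning a list of wrapped lines.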
Reformat the single paragraph in 'text' so it fits in lines of no
more than 'width' columns, and return a list of wrapped lines. By
default, tabs in 'text' are expanded with string.expandtabs(), and
all other whitespace characters (including newline) are converted to
space. See TextWrapper class for available keyword args to customize
wrapping behaviour.
"""
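To make the whitespace caveat concrete, here is a small comparison between textwrap.wrap and plain slicing on a string that contains spaces. The sample string and the variable names are only for illustration.

```python
import textwrap

text = "ab cd ef"

# textwrap.wrap breaks at whitespace and drops it at the line edges.
wrapped = textwrap.wrap(text, 3)

# Plain slicing keeps every character, spaces included.
sliced = [text[i:i + 3] for i in range(0, len(text), 3)]

print(wrapped)  # ['ab', 'cd', 'ef']
print(sliced)   # ['ab ', 'cd ', 'ef']
```

If the chunks must join back into the original string, slicing is the safer choice of the two.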
You can use a list comprehension that slices the string in steps of n, where n is the chunk length:
parts = [your_string[i:i+n] for i in range(0, len(your_string), n)]
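As a rough sketch, the same slicing can be wrapped in a function and applied to the example from the question. The name chunks_of is made up for this illustration, and the chunk length is taken to be len(data) // 4.

```python
def chunks_of(your_string, n):
    """Return consecutive slices of your_string, each n characters long.

    The last slice is shorter when len(your_string) is not a multiple of n.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [your_string[i:i + n] for i in range(0, len(your_string), n)]


data = "a" * 32 + "b" * 32 + "c" * 32 + "d" * 32  # length 128
parts = chunks_of(data, len(data) // 4)

print([len(p) for p in parts])    # [32, 32, 32, 32]
print(chunks_of("123456789", 2))  # ['12', '34', '56', '78', '9']
```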
Another common way of grouping elements into n-length groups:
>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))
['12', '34', '56', '78', '90']
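This method comes straight from the docs for zip(). Note that zip() stops at the shortest input, so if the length of the string is not a multiple of the group size, the trailing partial group is silently dropped.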
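Because zip() stops as soon as one of its inputs runs out, the idiom above loses a trailing partial group when the length is not a multiple of the group size. One possible workaround, shown here as a sketch, is itertools.zip_longest with an empty-string fill value; the sample string is arbitrary.

```python
from itertools import zip_longest

s = '123456789'  # odd length, so the last group is incomplete

# zip() discards the incomplete final group.
dropped = list(map(''.join, zip(*[iter(s)] * 2)))

# zip_longest() pads the final group with '' so nothing is lost.
kept = [''.join(group) for group in zip_longest(*[iter(s)] * 2, fillvalue='')]

print(dropped)  # ['12', '34', '56', '78']
print(kept)     # ['12', '34', '56', '78', '9']
```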
Recursive way:
def split_str(seq, chunk, skip_tail=False):
    lst = []
    if chunk <= len(seq):
        lst.extend([seq[:chunk]])
        lst.extend(split_str(seq[chunk:], chunk, skip_tail))
    elif not skip_tail and seq:
        lst.extend([seq])
    return lst
Demo:
seq = "123456789abcdefghij"
print(split_str(seq, 3))
print(split_str(seq, 3, skip_tail=True))
# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
# ['123', '456', '789', 'abc', 'def', 'ghi']
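One thing to keep in mind with the recursive version is that it makes one nested call per chunk, so a string with many chunks can exceed Python's recursion limit (usually 1000 by default in CPython). Below is a sketch of an iterative equivalent with the same skip_tail behaviour; the name split_str_iter is invented for this example.

```python
def split_str_iter(seq, chunk, skip_tail=False):
    """Iterative counterpart of split_str: slice seq into pieces of length chunk."""
    parts = [seq[i:i + chunk] for i in range(0, len(seq), chunk)]
    # Drop the final piece if it is shorter than chunk and the caller asked for that.
    if skip_tail and parts and len(parts[-1]) < chunk:
        parts.pop()
    return parts


seq = "123456789abcdefghij"
print(split_str_iter(seq, 3))
# ['123', '456', '789', 'abc', 'def', 'ghi', 'j']
print(split_str_iter(seq, 3, skip_tail=True))
# ['123', '456', '789', 'abc', 'def', 'ghi']

# 10000 chunks: no recursion involved, so no RecursionError.
print(len(split_str_iter("x" * 30000, 3)))  # 10000
```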
You can treat a string similarly to a list in many cases. There are lots of answers here: Splitting a list into N parts of approximately equal length
For example, you could work out chunk_size = len(my_string) // N (integer division, because slice indices must be integers).
Then to access a chunk you can use my_string[i: i + chunk_size] (and then increment i by chunk_size), either in a for loop or in a list comprehension.
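When the length is not an exact multiple of N, a fixed chunk size leaves a remainder. One way to still get exactly N parts, sketched below, is to use divmod and hand one extra character to each of the first few parts. The function name split_into_n is hypothetical, and this is only one of several reasonable ways to distribute the remainder.

```python
def split_into_n(my_string, n):
    """Split my_string into exactly n parts whose lengths differ by at most 1."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    base, extra = divmod(len(my_string), n)
    parts = []
    start = 0
    for i in range(n):
        # The first `extra` parts each take one additional character.
        end = start + base + (1 if i < extra else 0)
        parts.append(my_string[start:end])
        start = end
    return parts


print(split_into_n("abcdefghij", 3))  # ['abcd', 'efg', 'hij']

data = "a" * 32 + "b" * 32 + "c" * 32 + "d" * 32
print([len(p) for p in split_into_n(data, 4)])  # [32, 32, 32, 32]
```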
I like iterators!
def chunk(in_string, num_chunks):
    chunk_size = len(in_string) // num_chunks
    if len(in_string) % num_chunks:
        chunk_size += 1  # round up so no characters are left over
    iterator = iter(in_string)
    for _ in range(num_chunks):
        accumulator = list()
        for _ in range(chunk_size):
            try:
                accumulator.append(next(iterator))
            except StopIteration:
                break
        yield ''.join(accumulator)
## DEMO
>>> string = "a"*32+"b"*32+"c"*32+"d"*32
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb', 'cccccccccccccccccccccccccccccccc', 'dddddddddddddddddddddddddddddddd']
>>> string += "e" # so it's not evenly divisible
>>> list(chunk(string,4))
['aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab', 'bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbcc', 'ccccccccccccccccccccccccccccccddd', 'ddddddddddddddddddddddddddddde']
Also demonstrably faster than textwrap.wrap, although almost certainly less "good":
>>> timeit.timeit(lambda: list(chunk(string,4)),number=500)
0.047726927170444355
>>> timeit.timeit(lambda: textwrap.wrap(string,len(string)//4),number=500)
0.20812756575945457
And pretty easy to hack to work with any iterable (just drop the str.join and yield accumulator unless isinstance(in_string, str)):
# after a petty hack
>>> list(chunk([1,2,3,4,5,6,7,8,9,10,11,12],4))
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
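The hacked version is not shown above, so here is one guess at what it might look like. It uses itertools.islice in place of the inner loop, which is a substitution and not the original code, and the name chunk_any is made up. Like the original, it needs len(), so it works on sequences and not on arbitrary one-shot iterators.

```python
from itertools import islice


def chunk_any(seq, num_chunks):
    """Yield num_chunks groups from seq; strings stay strings, others become lists."""
    chunk_size = -(-len(seq) // num_chunks)  # ceiling division
    iterator = iter(seq)
    for _ in range(num_chunks):
        # islice pulls at most chunk_size items and stops quietly when exhausted.
        group = list(islice(iterator, chunk_size))
        yield ''.join(group) if isinstance(seq, str) else group


print(list(chunk_any([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 4)))
# [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(list(chunk_any("abcdefgh", 4)))
# ['ab', 'cd', 'ef', 'gh']
```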