Split string every nth character

Question

How do I split a string every nth character?

'1234567890'   →   ['12', '34', '56', '78', '90']

For the same question with a list, see "How do I split a list into equally-sized chunks?"

score 779 · Answer 1 · answered Feb 28 '12 at 02:02

>>> line = '1234567890'
>>> n = 2
>>> [line[i:i+n] for i in range(0, len(line), n)]
['12', '34', '56', '78', '90']

answered Feb 28 '12 at 02:02

satomacoto

@TrevorRudolph It only does exactly what you tell it. The above answer is really only just a for loop but expressed pythonically. Also, if you need to remember a "simplistic" answer, there are at least hundreds of thousands of ways to remember them: starring the page on stackoverflow; copying and then pasting into an email; keeping a "helpful" file with stuff you want to remember; simply using a modern search engine whenever you need something; using bookmarks in (probably) every web browser; etc. – dylnmc Nov 02 '14 at 04:03
It is easier to understand but it has the downside that you must reference 'line' twice. – Damien Jan 05 '16 at 14:33
Great for breaking up long lines for printing, e.g. ``for i in range(0, len(string), n): print(string[i:i+n])`` – PatrickT Aug 06 '21 at 06:12
for any noobs like me who don't get list comprehensions, the following may be easier to understand, in place of the last line: `substrings = []` `for i in range(0, len(line), n): substring = line[i:i+n] substrings.append(substring)` – ArduinoBen May 15 '23 at 04:11
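
Written out on separate lines, the loop from the last comment looks roughly like the sketch below. The wrapper function `split_every_n` is a made-up name for illustration and is not part of the answer.

```python
def split_every_n(line, n):
    """Return consecutive n-character slices of line (the last may be shorter)."""
    substrings = []
    # Step through the start index of each chunk: 0, n, 2n, ...
    for i in range(0, len(line), n):
        substrings.append(line[i:i + n])
    return substrings


print(split_every_n('1234567890', 2))  # ['12', '34', '56', '78', '90']
print(split_every_n('123456789', 2))   # ['12', '34', '56', '78', '9']
```

Slicing past the end of a string is safe in Python, which is why the final, shorter chunk needs no special handling.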

score 339 · Answer 2 · edited Oct 18 '19 at 09:44

Just to be complete, you can do this with a regex:

>>> import re
>>> re.findall('..','1234567890')
['12', '34', '56', '78', '90']

For odd number of chars you can do this:

>>> import re
>>> re.findall('..?', '123456789')
['12', '34', '56', '78', '9']

You can also do the following, to simplify the regex for longer chunks:

>>> import re
>>> re.findall('.{1,2}', '123456789')
['12', '34', '56', '78', '9']

And you can use re.finditer if the string is long to generate chunk by chunk.
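
A minimal sketch of the `re.finditer` variant, assuming chunks of up to `n` characters are wanted. The helper name `iter_chunks` is hypothetical, and the `re.DOTALL` flag is an addition so that newlines are not skipped.

```python
import re


def iter_chunks(text, n):
    """Lazily yield chunks of up to n characters using re.finditer."""
    # re.DOTALL lets '.' match newlines too, so no character is skipped.
    pattern = re.compile('.{1,%d}' % n, flags=re.DOTALL)
    for match in pattern.finditer(text):
        yield match.group(0)


print(list(iter_chunks('123456789', 2)))  # ['12', '34', '56', '78', '9']
```

Because this is a generator, only one chunk is materialised at a time, which may help with very long strings.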

edited Oct 18 '19 at 09:44

Georgy

answered Feb 28 '12 at 06:31

the wolf

This is by far the best answer here and deserves to be on top. One could even write `'.'*n` to make it more clear. No joining, no zipping, no loops, no list comprehension; just find the next two characters next to each other, which is exactly how a human brain thinks about it. If Monty Python were still alive, he'd love this method! – SO_fix_the_vote_sorting_bug Dec 12 '18 at 01:27
This is the fastest method for reasonably long strings too: https://gitlab.com/snippets/1908857 – Ralph Bolton Oct 30 '19 at 16:03
This won't work if the string contains newlines. This needs `flags=re.S`. – Aran-Fey Nov 14 '19 at 17:17
Yeah this is not a good answer. Regexes have so many gotchas (as Aran-Fey found!) that you should use them *very sparingly*. You definitely don't need them here. They're only faster because they're implemented in C and Python is crazy slow. – Timmmm Mar 22 '22 at 15:17
This is fast but more_itertools.sliced seems more efficient. – FifthAxiom Jun 01 '22 at 04:42
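
The newline caveat raised in the comments can be reproduced with a short experiment. This is only an illustration on a made-up input string:

```python
import re

s = 'ab\ncdef'

# Without re.S, '.' does not match '\n', so the newline is silently dropped.
without_flag = re.findall('.{1,2}', s)
# With re.S (re.DOTALL), every character ends up in exactly one chunk.
with_flag = re.findall('.{1,2}', s, flags=re.S)

print(without_flag)  # ['ab', 'cd', 'ef']
print(with_flag)     # ['ab', '\nc', 'de', 'f']
```

Note that the chunk boundaries shift as well, since the dropped newline no longer counts toward a chunk.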

score 286 · Answer 3 · edited Mar 16 '23 at 21:57

There is already an inbuilt function in Python for this.

>>> from textwrap import wrap
>>> s = '1234567890'
>>> wrap(s, 2)
['12', '34', '56', '78', '90']

This is what the docstring for wrap says:

>>> help(wrap)
'''
Help on function wrap in module textwrap:

wrap(text, width=70, **kwargs)
    Wrap a single paragraph of text, returning a list of wrapped lines.

    Reformat the single paragraph in 'text' so it fits in lines of no
    more than 'width' columns, and return a list of wrapped lines.  By
    default, tabs in 'text' are expanded with string.expandtabs(), and
    all other whitespace characters (including newline) are converted to
    space.  See TextWrapper class for available keyword args to customize
    wrapping behaviour.
'''

edited Mar 16 '23 at 21:57

Eugene Yarmash

answered Feb 19 '18 at 06:57

Diptangsu Goswami

print(wrap('12345678', 3)) splits the string into groups of 3 digits, but starts in front and not behind. Result: ['123', '456', '78'] – Atalanttore May 20 '19 at 19:20
It is interesting to learn about 'wrap' yet it is not doing exactly what was asked above. It is more oriented towards displaying text, rather than splitting a string to a fixed number of characters. – Oren Jun 05 '19 at 15:21
`wrap` may not return what is asked for if the string contains space. e.g. `wrap('0 1 2 3 4 5', 2)` returns `['0', '1', '2', '3', '4', '5']` (the elements are stripped) – satomacoto Jun 20 '19 at 09:22
This indeed answers the question, but what happens if there's spaces and you want them maintained in the split characters? wrap() removes spaces if they fall straight after a split group of characters – Iron Attorney Jul 05 '19 at 18:56
This works poorly if you want to split text with hyphens (the number you give as argument is actually the MAXIMUM number of characters, not exact one, and it breaks i.e. on hyphens and white spaces). – MrVocabulary Aug 06 '19 at 14:11
`wrap()` appears to be pretty slow (and much slower than say the regex solution): https://gitlab.com/snippets/1908857 – Ralph Bolton Oct 30 '19 at 16:01
you can use `drop_whitespace=False` and `break_on_hyphens=False` to prevent the issues stated by satomacoto and MrVocabulary. See the [full documentation](https://docs.python.org/3/library/textwrap.html#textwrap.TextWrapper) – bmurauer Mar 25 '21 at 08:40
@Atalanttore Just do the following: `".".join(wrap(str(12345678)[::-1], 3))[::-1]` and you end up with `12.345.678`. – Gilfoyle May 02 '22 at 07:43
This is so slow. more_itertools.sliced and re.findall are much faster. – FifthAxiom Jun 01 '22 at 04:38
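
The whitespace behaviour described in the comments can be checked side by side with plain slicing. A small sketch, using the example string from satomacoto's comment:

```python
from textwrap import wrap

s = '0 1 2 3 4 5'

# wrap() treats the input as prose: it breaks on whitespace and drops it.
wrapped = wrap(s, 2)
# Plain slicing keeps every character, spaces included.
sliced = [s[i:i + 2] for i in range(0, len(s), 2)]

print(wrapped)  # ['0', '1', '2', '3', '4', '5']
print(sliced)   # ['0 ', '1 ', '2 ', '3 ', '4 ', '5']
```

So `wrap()` is a reasonable fit when the goal is displaying text, while slicing is the safer choice when every character has to be preserved.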

score 100 · Answer 4 · answered Feb 28 '12 at 02:25

Another common way of grouping elements into n-length groups:

>>> s = '1234567890'
>>> list(map(''.join, zip(*[iter(s)]*2)))  # list() is needed on Python 3
['12', '34', '56', '78', '90']

This method comes straight from the docs for zip().

answered Feb 28 '12 at 02:25

Andrew Clark

In [19]: a = "hello world"; list( map( "".join, zip(*[iter(a)]*4) ) ) get the result ['hell', 'o wo']. – truease.com Apr 18 '13 at 15:54
If someone finds `zip(*[iter(s)]*2)` tricky to understand, read [How does `zip(*[iter(s)]*n)` work in Python?](http://stackoverflow.com/questions/2233204/how-does-zipitersn-work-in-python). – Grijesh Chauhan Jan 11 '14 at 14:49
This does not account for an odd number of chars, it'll simply drop those chars: `>>> map(''.join, zip(*[iter('01234567')]*5))` -> `['01234']` – Bjorn Sep 15 '14 at 19:39
To also handle odd number of chars just replace `zip()` with `itertools.zip_longest()`: `map(''.join, zip_longest(*[iter(s)]*2, fillvalue=''))` – Paulo Freitas Jun 08 '17 at 07:44
Also useful: docs for [`map()`](https://docs.python.org/3/library/functions.html#map) – winklerrr Apr 23 '19 at 11:17
I hope I never find this in production. Incredibly difficult to read for something that should be rather simple – Neuron Dec 16 '22 at 12:23

score 77 · Answer 5 · edited Feb 01 '19 at 18:33

I think this is shorter and more readable than the itertools version:

def split_by_n(seq, n):
    '''A generator to divide a sequence into chunks of n units.'''
    while seq:
        yield seq[:n]
        seq = seq[n:]

print(list(split_by_n('1234567890', 2)))

edited Feb 01 '19 at 18:33

Diptangsu Goswami

answered Feb 28 '12 at 01:53

Russell Borogove

but not really efficient: when applied to strings: too many copies – Eric Aug 27 '15 at 21:17
It also doesn't work if seq is a generator, which is what the itertools version is _for_. Not that OP asked for that, but it's not fair to criticize itertool's version not being as simple. – mikenerone Jun 28 '17 at 20:47

score 41 · Answer 6 · answered Jun 22 '17 at 10:19

Using more-itertools from PyPI:

>>> from more_itertools import sliced
>>> list(sliced('1234567890', 2))
['12', '34', '56', '78', '90']

answered Jun 22 '17 at 10:19

Tim Diels

score 36 · Answer 7 · answered Sep 12 '15 at 23:14

I like this solution:

s = '1234567890'
o = []
while s:
    o.append(s[:2])
    s = s[2:]

answered Sep 12 '15 at 23:14

vlk

for loops are faster in python especially if you are iterating many times – Kaleba KB Keitshokile Apr 22 '23 at 09:45

Eugene Yarmash · Answer 8 · 2023-03-16T07:00:34.790

You could use the grouper() recipe from itertools:

Python 2.x:

from itertools import izip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

Python 3.x:

from itertools import zip_longest

def grouper(iterable, n, *, incomplete='fill', fillvalue=None):
    "Collect data into non-overlapping fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, fillvalue='x') --> ABC DEF Gxx
    # grouper('ABCDEFG', 3, incomplete='strict') --> ABC DEF ValueError
    # grouper('ABCDEFG', 3, incomplete='ignore') --> ABC DEF
    args = [iter(iterable)] * n
    if incomplete == 'fill':
        return zip_longest(*args, fillvalue=fillvalue)
    if incomplete == 'strict':
        return zip(*args, strict=True)
    if incomplete == 'ignore':
        return zip(*args)
    else:
        raise ValueError('Expected fill, strict, or ignore')

These functions are memory-efficient and work with any iterables.
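
Note that `grouper()` yields tuples of characters rather than strings, so for the question's use case each group still has to be joined. A minimal usage sketch, assuming the fill behaviour with an empty-string `fillvalue` so the padding vanishes after joining. The `grouper` below is a pared-down version of the recipe, not the full one from the docs.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Simplified form of the recipe: pad the last group with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# Works on any iterable, including a generator that is consumed lazily.
chars = (c for c in 'ABCDEFG')
chunks = [''.join(group) for group in grouper(chars, 3, fillvalue='')]

print(chunks)  # ['ABC', 'DEF', 'G']
```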

Throwing an overflow when using very large strings (len=2**22*40) — FifthAxiom, Jun 01 '22 at 04:36
@FifthAxiom What version of Python and what kind of overflow are you talking about? — Eugene Yarmash, Mar 16 '23 at 06:55

score 16 · Answer 9 · edited May 23 '20 at 09:35

This can be achieved by a simple for loop.

a = '1234567890a'
result = []

for i in range(0, len(a), 2):
    result.append(a[i : i + 2])
print(result)

The output looks like ['12', '34', '56', '78', '90', 'a']

edited May 23 '20 at 09:35

Sunil Purushothaman

answered May 22 '20 at 18:02

Kasem777

While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. – β.εηοιτ.βε May 22 '20 at 18:41
This is the same solution as here: https://stackoverflow.com/a/59091507/7851470 – Georgy May 22 '20 at 20:23
This is the same solution as the top voted answer - except for the fact that the top answer is using list comprehension. – Leonardus Chen Dec 07 '20 at 04:54

score 13 · Answer 10 · edited Nov 27 '22 at 05:01

I was stuck in the same scenario.

This worked for me:

x = "1234567890"
n = 2
my_list = []
for i in range(0, len(x), n):
    my_list.append(x[i:i+n])
print(my_list)

Output:

['12', '34', '56', '78', '90']

edited Nov 27 '22 at 05:01

Brainsluggy

answered Nov 28 '19 at 14:54

Strick

`list` is a built-in name in Python (not a reserved keyword, but shadowing it hides the built-in), so you should change the variable name to something else such as `my_list`. – Justin Hammond Dec 04 '20 at 01:38

U13-Forward · Answer 11 · 2023-03-29T03:49:42.417

9

Try this:

s = '1234567890'
print([s[idx:idx+2] for idx in range(len(s)) if idx % 2 == 0])

Output:

['12', '34', '56', '78', '90']

edited Mar 29 '23 at 03:49

answered Jul 10 '18 at 03:46

U13-Forward

why enumerate(s) if you're going to ignore the val? just do `for i in range(len(s))`; why iterate over every value only to throw away half of them? just skip the values you don't need: `for i in range(0, len(s), 2)` (and skip the `if` part) – Arthur Tacca Mar 28 '23 at 15:54

score 8 · Answer 12 · answered Feb 28 '12 at 01:52

Try the following code:

from itertools import islice

def split_every(n, iterable):
    i = iter(iterable)
    piece = list(islice(i, n))
    while piece:
        yield piece
        piece = list(islice(i, n))

s = '1234567890'
print(list(split_every(2, list(s))))

answered Feb 28 '12 at 01:52

enderskill

Your answer doesn't meet OP's requirement, you have to use `yield ''.join(piece)` to make it work as expected: https://eval.in/813878 – Paulo Freitas Jun 08 '17 at 08:15
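
A sketch of the change suggested in the comment, joining each piece so that the generator yields strings instead of lists of characters. It keeps the answer's `split_every` name and argument order:

```python
from itertools import islice


def split_every(n, iterable):
    """Yield strings of up to n characters from any iterable of characters."""
    it = iter(iterable)
    piece = ''.join(islice(it, n))
    while piece:
        yield piece
        piece = ''.join(islice(it, n))


print(list(split_every(2, '1234567890')))  # ['12', '34', '56', '78', '90']
```

Since a string is itself iterable, the extra `list(s)` conversion is not needed here.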

score 6 · Answer 13 · edited Mar 16 '23 at 07:05

As always, for those who love one liners:

n = 2
line = "this is a line split into n characters"
line = [line[i * n:i * n+n] for i, blah in enumerate(line[::n])]

edited Mar 16 '23 at 07:05

Eugene Yarmash

answered May 20 '16 at 20:00

Sqripter

When I run this in Python Fiddle with a `print(line)` I get `this is a line split into n characters` as the output. Might you be better putting: `line = [line[i * n:i * n+n] for i,blah in enumerate(line[::n])]`? Fix this and it's a good answer :). – Peter David Carter May 20 '16 at 20:24
Can you explain the `,blah` and why it's necessary? I notice I can replace `blah` with any alpha character/s, but not numbers, and can't remove the `blah` or/and the comma. My editor suggests adding whitespace after `,` :s – toonarmycaptain Jul 17 '17 at 20:11
`enumerate` yields (index, item) pairs, so you need two names to unpack them into. But you don't actually need the item for anything in this case. – Daniel F Jul 27 '17 at 09:18
Rather than `blah` I prefer to use an underscore or double underscore, see: https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-single-underscore-variable-in-python – Andy Royal Aug 15 '17 at 10:39
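
Putting the comments together, the one-liner can be written with an underscore for the unused item. As a sketch, it is shown next to an equivalent `range`-based form that avoids building the throwaway slice `line[::n]`; the variable names `chunks` and `same` are illustrative only.

```python
n = 2
line = "this is a line split into n characters"

# enumerate() yields (index, item) pairs; the item is unused, hence the underscore.
chunks = [line[i * n:i * n + n] for i, _ in enumerate(line[::n])]

# Equivalent result, stepping the start index directly.
same = [line[i:i + n] for i in range(0, len(line), n)]

print(chunks == same)  # True
```

Assigning to a new name rather than back to `line` also keeps the original string available.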

score 6 · Answer 14 · answered Feb 28 '12 at 01:56

>>> from functools import reduce
>>> from operator import add
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x)]  # Python 2: itertools.izip
['12', '34', '56', '78', '90']
>>> x = iter('1234567890')
>>> [reduce(add, tup) for tup in zip(x, x, x)]  # the trailing '0' is dropped
['123', '456', '789']

pylang · Answer 15 · 2018-03-30T22:12:42.620

more_itertools.sliced has been mentioned before. Here are four more options from the more_itertools library:

import more_itertools as mit

s = "1234567890"

# grouper takes (iterable, n) in current releases; older ones used (n, iterable)
["".join(c) for c in mit.grouper(s, 2)]

["".join(c) for c in mit.chunked(s, 2)]

["".join(c) for c in mit.windowed(s, 2, step=2)]

["".join(c) for c in mit.split_after(s, lambda x: int(x) % 2 == 0)]

Each of these options produces the following output:

['12', '34', '56', '78', '90']

Documentation for discussed options: grouper, chunked, windowed, split_after

englealuze · Answer 16 · 2018-10-22T11:41:24.380

A simple recursive solution for short strings:

def split(s, n):
    if len(s) <= n:
        # Base case: keep a final shorter chunk instead of dropping it
        return [s] if s else []
    else:
        return [s[:n]] + split(s[n:], n)

print(split('1234567890', 2))

Or in this form:

def split(s, n):
    if not s:
        return []
    elif len(s) <= n:
        return [s]
    else:
        return split(s[:n], n) + split(s[n:], n)

This illustrates the typical divide-and-conquer pattern of the recursive approach more explicitly (though in practice it is not necessary to do it this way).

score 2 · Answer 17 · answered Jul 23 '21 at 23:08

A solution with groupby:

from itertools import groupby, chain, repeat, cycle

text = "wwworldggggreattecchemggpwwwzaz"
n = 3
c = cycle(chain(repeat(0, n), repeat(1, n)))
res = ["".join(g) for _, g in groupby(text, lambda x: next(c))]
print(res)

Output:

['www', 'orl', 'dgg', 'ggr', 'eat', 'tec', 'che', 'mgg', 'pww', 'wza', 'z']

Yosef Bernal · Answer 18 · 2022-07-22T09:36:14.727

0

These answers are all nice and working and all, but the syntax is so cryptic... Why not write a simple function?

def SplitEvery(string, length):
    if len(string) <= length: return [string]
    # Ceiling division: range() needs an int, and a shorter final piece is kept
    sections = -(-len(string) // length)
    lines = []
    start = 0
    for i in range(sections):
        line = string[start:start+length]
        lines.append(line)
        start += length
    return lines

And call it simply:

text = '1234567890'
lines = SplitEvery(text, 2)
print(lines)

# output: ['12', '34', '56', '78', '90']

edited Jul 22 '22 at 09:36

answered Jul 22 '22 at 09:12

Yosef Bernal

You cannot pass a float to the range function, so the function you display wouldn't work. (Try running it if you don't believe me) – cd-CreepArghhh Oct 03 '22 at 10:15

score 0 · Answer 19 · answered Jan 23 '23 at 08:13

Another solution using groupby and index//n as the key to group the letters:

from itertools import groupby

text = "abcdefghij"
n = 3

result = []
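# Pair each character with its index so that index // n can serve as the key
for idx, chunk in groupby(enumerate(text), key=lambda pair: pair[0] // n):
    result.append("".join(char for _, char in chunk))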

# result = ['abc', 'def', 'ghi', 'j']
