Iterate over a string 2 (or n) characters at a time in Python

Question

Earlier today I needed to iterate over a string 2 characters at a time for parsing a string formatted like "+c-R+D-E" (there are a few extra letters).

I ended up with this, which works, but it looks ugly. I ended up commenting what it was doing because it felt non-obvious. It almost seems pythonic, but not quite.

# Might not be exact, but you get the idea, use the step
# parameter of range() and slicing to grab 2 chars at a time
s = "+c-R+D-e"
for op, code in (s[i:i+2] for i in range(0, len(s), 2)):
  print op, code

Are there some better/cleaner ways to do this?

possible duplicate of [What is the most "pythonic" way to iterate over a list in chunks?](http://stackoverflow.com/questions/434287/what-is-the-most-pythonic-way-to-iterate-over-a-list-in-chunks) — Cristian Ciupitu, Jun 28 '14 at 22:34

score 59 · Accepted Answer · edited Feb 19 '21 at 20:51

I don't know about cleaner, but there's another alternative:

for (op, code) in zip(s[0::2], s[1::2]):
    print op, code

A no-copy version:

from itertools import izip, islice
for (op, code) in izip(islice(s, 0, None, 2), islice(s, 1, None, 2)):
    print op, code

edited Feb 19 '21 at 20:51

peterh


answered Jul 22 '09 at 01:39

Pavel Minaev


I really like this one... I just wish it didn't make copies to iterate over. – Richard Levasseur Jul 22 '09 at 19:00
The zip approach skips the last character when the string has an odd number of characters. – codeforester Nov 13 '20 at 04:41
For python3 the "no-copy version" is not necessary (and is actually no longer valid). See https://stackoverflow.com/questions/32659552/importing-izip-from-itertools-module-gives-nameerror-in-python-3-x – shao.lo Jan 01 '22 at 02:09
Slices are still copies in Python 3, so you still need `islice`. – Pavel Minaev Jan 02 '22 at 11:46
Generalizing the idea, in Python 3 you can write: `for ... in zip(*(islice(s, i, None, n) for i in range(n))):` – SnzFor16Min Jan 20 '23 at 06:59
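
For Python 3 readers, the generalization in the last comment can be sketched roughly as below. `chunks_zip` is a hypothetical name, not something from the answer, and the sketch assumes that dropping a trailing group shorter than `n` is acceptable, as with the original `zip` approach.

```python
from itertools import islice

def chunks_zip(s, n):
    """Yield n-tuples of consecutive items from s (hypothetical helper).

    Like the zip-based answer, a trailing group shorter than n is dropped.
    """
    # islice(s, i, None, n) walks s from offset i with step n,
    # without building the intermediate copies that s[i::n] would create.
    return zip(*(islice(s, i, None, n) for i in range(n)))

for op, code in chunks_zip("+c-R+D-e", 2):
    print(op, code)
```

In Python 3 the built-in `zip` is already lazy, which is why `izip` is no longer needed here.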

score 18 · Answer 2 · answered Jul 22 '09 at 01:35

Maybe this would be cleaner?

s = "+c-R+D-e"
for i in xrange(0, len(s), 2):
    op, code = s[i:i+2]
    print op, code

You could perhaps write a generator to do what you want; that might be more pythonic :)

answered Jul 22 '09 at 01:35

Paggas


+1 simple and it works for any n (if the ValueError exception is handled when len(s) is not a multiple of n). – mhawke Jul 22 '09 at 03:04
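
The generator idea and the ValueError caveat from the comment might look something like this in Python 3. This is only a sketch: `chunks` is a made-up name, and printing the dangling piece is just one possible way to handle it.

```python
def chunks(s, n):
    """Yield successive length-n slices of s; the last one may be shorter."""
    for i in range(0, len(s), n):
        yield s[i:i + n]

s = "+c-R+D-e!"  # odd length: the final chunk is just "!"
for chunk in chunks(s, 2):
    if len(chunk) < 2:
        # Unpacking a short chunk into (op, code) would raise ValueError,
        # so the dangling piece is handled explicitly instead.
        print("dangling:", chunk)
        continue
    op, code = chunk
    print(op, code)
```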

score 6 · Answer 3 · edited Aug 02 '09 at 15:00

from itertools import izip_longest
def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)
def main():
    s = "+c-R+D-e"
    for item in grouper(s, 2):
        print ' '.join(item)
if __name__ == "__main__":
    main()
##output
##+ c
##- R
##+ D
##- e

izip_longest requires Python 2.6 (or higher). If you are on Python 2.4 or 2.5, use the definition of izip_longest from the documentation, or change the grouper function to:

from itertools import izip, chain, repeat
def grouper(iterable, n, padvalue=None):
    return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)

edited Aug 02 '09 at 15:00

answered Jul 22 '09 at 01:36

sunqiang


Best answer, except it's renamed to `zip_longest` [in Python3](https://docs.python.org/3.4/library/itertools.html#itertools.zip_longest). – cdunn2001 Jul 08 '14 at 19:29
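
Following that comment, a Python 3 version of `grouper` could be sketched as below. It is the same recipe with `zip_longest` swapped in; the inline comment spells out why repeating one iterator `n` times produces consecutive groups.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # The list holds the SAME iterator n times, so each tuple that
    # zip_longest builds pulls n consecutive items from the input.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

for op, code in grouper("+c-R+D-e", 2):
    print(op, code)
```

With an odd-length input the last tuple is padded with `fillvalue`, so unpacking into `op, code` still works.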

score 6 · Answer 4 · edited May 23 '17 at 12:26

Triptych inspired this more general solution:

def slicen(s, n, truncate=False):
    assert n > 0
    while len(s) >= n:
        yield s[:n]
        s = s[n:]
    if len(s) and not truncate:
        yield s

for op, code in slicen("+c-R+D-e", 2):
    print op,code

edited May 23 '17 at 12:26

Community


answered Jul 22 '09 at 03:43

mhawke


score 5 · Answer 5 · edited Jul 22 '09 at 05:46

Great opportunity for a generator. For larger lists, this will be much more efficient than zipping every other element. Note that this version also handles strings with dangling ops.

def opcodes(s):
    while True:
        try:
            op   = s[0]
            code = s[1]
            s    = s[2:]
        except IndexError:
            return
        yield op, code

for op, code in opcodes("+c-R+D-e"):
    print op, code

edit: minor rewrite to avoid ValueError exceptions.

edited Jul 22 '09 at 05:46

answered Jul 22 '09 at 02:59

Kenan Banks


A few edge cases - always raises ValueError: try opcodes("a1") – mhawke Jul 22 '09 at 03:22

score 3 · Answer 6 · answered Oct 31 '13 at 06:00

This approach supports an arbitrary number of elements per result, evaluates lazily, and the input iterable can be a generator (no indexing is attempted):

import itertools

def groups_of_n(n, iterable):
    c = itertools.count()
    for _, gen in itertools.groupby(iterable, lambda x: c.next() / n):
        yield gen

Any left-over elements are returned in a shorter final group.

Example usage:

for g in groups_of_n(4, xrange(21)):
    print list(g)

[0, 1, 2, 3]
[4, 5, 6, 7]
[8, 9, 10, 11]
[12, 13, 14, 15]
[16, 17, 18, 19]
[20]
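
The code above is Python 2 (`c.next()`, integer `/`, `xrange`). A rough Python 3 equivalent might look like this. Note one deliberate difference, which is an assumption of this sketch rather than part of the answer: each group is materialized with `list()` before being yielded, so callers cannot accidentally skip ahead and invalidate a lazy `groupby` group.

```python
import itertools

def groups_of_n(n, iterable):
    """Yield lists of up to n consecutive items from any iterable."""
    counter = itertools.count()
    # Each item is keyed by (its position // n), so groupby starts a new
    # group every n items. next(counter) replaces Python 2's c.next().
    key = lambda _item: next(counter) // n
    for _, group in itertools.groupby(iterable, key):
        yield list(group)

for g in groups_of_n(4, range(21)):
    print(g)
```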

score 2 · Answer 7 · answered Jul 22 '09 at 02:40

The other answers work well for n = 2, but for the general case you could try this:

def slicen(s, n, truncate=False):
    nslices = len(s) / n
    if not truncate and (len(s) % n):
        nslices += 1
    return (s[i*n:n*(i+1)] for i in range(nslices))

>>> s = '+c-R+D-e'
>>> for op, code in slicen(s, 2):
...     print op, code
... 
+ c
- R
+ D
- e

>>> for a, b, c in slicen(s, 3):
...     print a, b, c
... 
+ c -
R + D
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: need more than 2 values to unpack

>>> for a, b, c in slicen(s,3,True):
...     print a, b, c
... 
+ c -
R + D
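
Under Python 3, `len(s) / n` produces a float and `range()` would reject it, so a port of `slicen` needs floor division. A minimal sketch, otherwise unchanged from the function above:

```python
def slicen(s, n, truncate=False):
    """Return a generator of length-n slices of s.

    With truncate=True a shorter trailing slice is dropped.
    """
    nslices = len(s) // n  # floor division keeps this an int in Python 3
    if not truncate and len(s) % n:
        nslices += 1
    return (s[i * n:(i + 1) * n] for i in range(nslices))

for op, code in slicen("+c-R+D-e", 2):
    print(op, code)
```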

score 2 · Answer 8 · answered Apr 22 '14 at 06:21

Here's my answer, a little bit cleaner to my eyes:

for i in range(0, len(string) - 1):
    if i % 2 == 0:
        print string[i:i+2]

answered Apr 22 '14 at 06:21

Eric Carmichael


`range` does support a step too ;) -- `for i in range(0, len(str), 2): print str[i:i+2]` – movatica Jun 28 '19 at 21:23

score 2 · Answer 9 · answered Dec 03 '16 at 01:58

Consider pip installing more_itertools, which already ships with a chunked implementation along with other helpful tools:

import more_itertools 

for op, code in more_itertools.chunked(s, 2):
    print(op, code)

Output:

+ c
- R
+ D
- e

answered Dec 03 '16 at 01:58

pylang


score 1 · Answer 10 · answered Jul 22 '09 at 01:28

>>> s = "+c-R+D-e"
>>> s
'+c-R+D-e'
>>> s[::2]
'+-+-'
>>>

answered Jul 22 '09 at 01:28

ghostdog74


score 1 · Answer 11 · answered Jul 22 '09 at 05:10

Maybe not the most efficient, but if you like regexes...

import re
s = "+c-R+D-e"
for op, code in re.findall('(.)(.)', s):
    print op, code
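
Two things to be aware of with the pattern above: `.` does not match a newline by default, and a trailing unpaired character is silently dropped. If that matters, a variant for arbitrary `n` could be sketched like this (Python 3; `regex_chunks` is a hypothetical name):

```python
import re

def regex_chunks(s, n):
    """Split s into pieces of up to n characters using a regex."""
    # .{1,n} greedily matches up to n characters, so a shorter tail is kept;
    # re.DOTALL lets "." match newlines as well.
    return re.findall(r".{1,%d}" % n, s, flags=re.DOTALL)

for chunk in regex_chunks("+c-R+D-e", 2):
    print(chunk[0], chunk[1])
```

This returns whole substrings rather than tuples of single characters, so a short final piece has to be checked before indexing or unpacking.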

answered Jul 22 '09 at 05:10

epost


score 0 · Answer 12 · answered Apr 01 '14 at 22:53

I ran into a similar problem. I ended up doing something like this:

ops = iter("+c-R+D-e")
for op in ops:
    code = ops.next()
    print op, code

I felt it was the most readable.
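
In Python 3 the method call `ops.next()` becomes the built-in `next(ops)`. A sketch of the same idea wrapped in a generator is below; `pairs` is a made-up name, and returning `None` for a missing code is just one assumed way to treat a dangling op.

```python
def pairs(s):
    """Yield (op, code) pairs by pulling two items per loop iteration."""
    it = iter(s)
    for op in it:
        # next() with a default avoids StopIteration on a dangling op;
        # inside a generator an uncaught StopIteration would surface
        # as RuntimeError on Python 3.7+.
        code = next(it, None)
        yield op, code

for op, code in pairs("+c-R+D-e"):
    print(op, code)
```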

answered Apr 01 '14 at 22:53

Xavi


score 0 · Answer 13 · answered Dec 08 '21 at 14:14

I made this simple generator:

def every_two(s):
    d = list(s)
    c = True
    for i in range(len(d)):
        if c:
            c = False
            yield d[i], d[i+1]
        else:
            c = True

It will raise an IndexError if the length of the string isn't divisible by two, but you can just wrap the yield statement in a try block.
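
As an alternative to a try block, the loop bounds can be chosen so the out-of-range index never occurs. This is only a sketch of that idea, not the author's code, and it assumes an unpaired trailing character should simply be skipped:

```python
def every_two(s):
    """Yield consecutive pairs from s, skipping an unpaired last item."""
    d = list(s)
    # Stop at len(d) - 1 so d[i + 1] always exists; an odd trailing
    # item is skipped instead of raising IndexError.
    for i in range(0, len(d) - 1, 2):
        yield d[i], d[i + 1]

for op, code in every_two("+c-R+D-e"):
    print(op, code)
```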

answered Dec 08 '21 at 14:14

Flampt


Thanks for sharing your answer on the platform. But for this question, an answer working for all `n` would be helpful – Vineet Dec 08 '21 at 16:29
Answer is incorrect – Vineet Dec 08 '21 at 16:29
