Counting in thirds on Python

Question

Out of curiosity, is there a way in Python to count in thirds without using range, using slices and indices instead? For example, suppose you had a codon sequence like 'CAGCAGCAT'. Could Python divide that string into thirds like this: CAG CAG CAT? I tried, but I failed. If there's a way, please show me how.

http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python is related. – rlms Feb 26 '14 at 21:47
It's funny how he specifically said "no range" and 4 out of 6 below use range. :) – Russia Must Remove Putin Feb 26 '14 at 21:51
Oh funny, I didn't even read that. Wonder where that requirement comes from… – poke Feb 26 '14 at 21:54
Even though you didn't ask for it, you may be interested in [Biopython](http://biopython.org/DIST/docs/tutorial/Tutorial.html): Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. – wolfrevo Feb 26 '14 at 21:57

score 3 · Answer 1 · answered Feb 26 '14 at 21:47

import textwrap
textwrap.wrap('CAGCAGCAT' ,3)

returns

['CAG', 'CAG', 'CAT']

answered Feb 26 '14 at 21:47

Russia Must Remove Putin

unutbu · Answer 2 · 2014-02-27T13:01:51.237

2

You could use the grouper recipe, zip(*[iterator]*n), to collect items without using range.

In [96]: data = 'CAGCAGCAT'

In [97]: [''.join(grp) for grp in zip(*[iter(data)]*3)]
Out[97]: ['CAG', 'CAG', 'CAT']
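Why this works: `[iter(data)]*3` is a list that holds the same iterator object three times, so every tuple that zip builds pulls three consecutive characters from it. A minimal sketch of that idea (the names `it` and `refs` are only for illustration, they are not part of the recipe):

```python
data = 'CAGCAGCAT'

# One iterator object referenced three times, not three independent iterators.
it = iter(data)
refs = [it] * 3
same_object = all(r is it for r in refs)

# zip asks each argument for its next item in turn; because every argument is
# the same iterator, each tuple consumes three consecutive characters.
codons = [''.join(grp) for grp in zip(*refs)]

print(same_object)  # True
print(codons)       # ['CAG', 'CAG', 'CAT']
```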

If len(data) is not a multiple of 3, then the above drops the remainder. To prevent that, use itertools.izip_longest (renamed itertools.zip_longest in Python 3):

In [102]: import itertools as IT
In [108]: [''.join(grp) for grp in IT.izip_longest(*[iter('CAGCAGCATCA')]*3, fillvalue='')]
Out[108]: ['CAG', 'CAG', 'CAT', 'CA']
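The session above is Python 2. On Python 3 the same idea might look like the sketch below; the helper name `codons_keep_tail` is made up for this example.

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def codons_keep_tail(seq, n=3):
    """Split seq into n-character strings, keeping a shorter final piece.

    Hypothetical helper for illustration only.
    """
    # fillvalue='' pads the last group with empty strings, so ''.join
    # simply yields the leftover characters.
    return [''.join(grp) for grp in zip_longest(*[iter(seq)] * n, fillvalue='')]


print(codons_keep_tail('CAGCAGCATCA'))  # ['CAG', 'CAG', 'CAT', 'CA']
```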

By the way, the grouper recipe works with any iterable, whereas textwrap.wrap works only with strings. Moreover, the grouper recipe is faster:

In [100]: %timeit textwrap.wrap(data, 3)
10000 loops, best of 3: 17.7 µs per loop

In [101]: %timeit [''.join(grp) for grp in zip(*[iter(data)]*3)]
100000 loops, best of 3: 1.78 µs per loop

Also note that textwrap.wrap may not group your string into groups of 3 characters if the string contains spaces:

In [42]: textwrap.wrap('I am a hat', 3)
Out[42]: ['I', 'am', 'a', 'hat']

edited Feb 27 '14 at 13:01

answered Feb 26 '14 at 21:47

unutbu

No range, impressive! – Russia Must Remove Putin Feb 26 '14 at 21:49
This is my favorite way to do it (although it can leave off some end items if the length is not `0 (mod n)`). – Joran Beasley Feb 26 '14 at 21:51
@JoranBeasley: True; in that case, use `itertools.izip_longest`. – unutbu Feb 26 '14 at 21:52
+1 lol I never thought of that I usually just tack on `+ data[-len(data)%n:]` – Joran Beasley Feb 26 '14 at 21:55

ndpu · Answer 3 · 2014-02-26T22:04:13.110

1

>>> s = 'CAGCAGCAT'
>>> [''.join(g) for g in zip(s[::3], s[1::3], s[2::3])]
['CAG', 'CAG', 'CAT']
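This one uses only slices, as the question asked: each stride slice picks every third character, starting at offset 0, 1 and 2. A small sketch of the intermediate values, and of what happens when the length is not a multiple of 3 (zip stops at the shortest slice, so the tail is dropped):

```python
s = 'CAGCAGCAT'

# Every third character, starting at positions 0, 1 and 2.
first, second, third = s[::3], s[1::3], s[2::3]
print(first, second, third)  # CCC AAA GGT

codons = [a + b + c for a, b, c in zip(first, second, third)]
print(codons)  # ['CAG', 'CAG', 'CAT']

# With two leftover characters the third slice is shorter, and zip
# truncates to it, so the trailing 'CA' is lost.
t = 'CAGCAGCATCA'
truncated = [''.join(g) for g in zip(t[::3], t[1::3], t[2::3])]
print(truncated)  # ['CAG', 'CAG', 'CAT']
```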

edited Feb 26 '14 at 22:04

answered Feb 26 '14 at 21:48

ndpu

score 1 · Answer 4 · answered Feb 26 '14 at 21:48

You can use a list comprehension; the third parameter of range is the step:

>>> s = "CAGCAGCAT"
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']
>>>

answered Feb 26 '14 at 21:48

pkacprzak

score 1 · Answer 5 · answered Feb 26 '14 at 21:48

You can use the grouper recipe from the itertools documentation (it is not built in, so you have to define it yourself):

>>> s = 'CAGCAGCAT'
>>> list(grouper(s, 3))
[('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
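The snippet above assumes grouper is already defined. A sketch along the lines of the recipe in the itertools documentation, written for Python 3 (the exact signature in the docs has changed between versions, so treat this one as an assumption):

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # The same iterator repeated n times, so each tuple takes n
    # consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


print(list(grouper('CAGCAGCAT', 3)))
# [('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]

# To get strings instead of tuples, join each group.
print([''.join(g) for g in grouper('CAGCAGCAT', 3)])
# ['CAG', 'CAG', 'CAT']
```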

Or in your case, you can also use simple slices:

>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']

score 1 · Answer 6 · edited May 23 '17 at 12:20

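def chunker(seq, size):
    # Lazily yields consecutive slices of length `size`.
    # xrange is Python 2 only; on Python 3 use range instead.
    return (seq[pos:pos + size] for pos in xrange(0, len(seq), size))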

Stolen from What is the most "pythonic" way to iterate over a list in chunks?
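For reference, a Python 3 version with a usage example. The only change assumed here is range in place of xrange; the generator is lazy, so wrap it in list() to see the chunks.

```python
def chunker(seq, size):
    # Works for anything that supports len() and slicing (str, list, tuple).
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))


print(list(chunker('CAGCAGCATCA', 3)))  # ['CAG', 'CAG', 'CAT', 'CA']
print(list(chunker([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Unlike the zip-based approaches, slicing keeps a short final chunk without any padding.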

edited May 23 '17 at 12:20

answered Feb 26 '14 at 21:48

Chris B.
