Out of curiosity, is there a way in Python for a program to count in thirds without using range, using slices and indices instead? For example, suppose you had a codon sequence like 'CAGCAGCAT'. Could Python divide that string into thirds like this: CAG CAG CAT? I tried, but I failed. If there's a way, please show me how.
Viewed 178 times
2
-
http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python is related. – rlms Feb 26 '14 at 21:47
-
It's funny how he specifically said "no range" and 4 out of 6 below use range. :) – Russia Must Remove Putin Feb 26 '14 at 21:51
-
Oh funny, I didn’t even read that. Wonder where that requirement comes from… – poke Feb 26 '14 at 21:54
-
Even though you didn't ask for it, you might be interested in [Biopython](http://biopython.org/DIST/docs/tutorial/Tutorial.html): Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. – wolfrevo Feb 26 '14 at 21:57
6 Answers
3
import textwrap
textwrap.wrap('CAGCAGCAT', 3)
returns
['CAG', 'CAG', 'CAT']

Russia Must Remove Putin
2
You could use the grouper recipe, `zip(*[iterator]*n)`, to collect items without using `range`.
In [96]: data = 'CAGCAGCAT'
In [97]: [''.join(grp) for grp in zip(*[iter(data)]*3)]
Out[97]: ['CAG', 'CAG', 'CAT']
If `len(data)` is not a multiple of 3, then the above chops off the remainder. To prevent that, use `itertools.izip_longest` (renamed `itertools.zip_longest` in Python 3):
In [102]: import itertools as IT
In [108]: [''.join(grp) for grp in IT.izip_longest(*[iter('CAGCAGCATCA')]*3, fillvalue='')]
Out[108]: ['CAG', 'CAG', 'CAT', 'CA']
By the way, the grouper recipe works with any iterable, whereas `textwrap.wrap` works only with strings. Moreover, the grouper recipe is faster:
In [100]: %timeit textwrap.wrap(data, 3)
10000 loops, best of 3: 17.7 µs per loop
In [101]: %timeit [''.join(grp) for grp in zip(*[iter(data)]*3)]
100000 loops, best of 3: 1.78 µs per loop
Also note that `textwrap.wrap` may not group your string into groups of 3 characters if the string contains spaces:
In [42]: textwrap.wrap('I am a hat', 3)
Out[42]: ['I', 'am', 'a', 'hat']
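For anyone running Python 3, here is a minimal sketch of the same idea. The function name `group_chars` is made up for illustration, and `zip_longest` stands in for the Python 2 `izip_longest` used above.

```python
from itertools import zip_longest

def group_chars(data, n):
    """Split data into strings of n characters, keeping a short final group."""
    # n references to ONE shared iterator: each output tuple pulls
    # n consecutive items from it.
    shared = [iter(data)] * n
    # fillvalue='' pads the last tuple so that ''.join leaves a short tail.
    return [''.join(grp) for grp in zip_longest(*shared, fillvalue='')]

print(group_chars('CAGCAGCAT', 3))    # ['CAG', 'CAG', 'CAT']
print(group_chars('CAGCAGCATCA', 3))  # ['CAG', 'CAG', 'CAT', 'CA']
```

The trick relies on the list holding the same iterator object n times, not n independent iterators.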

unutbu
-
-
This is my favorite way to do it (although it can leave off some end items if the length is not `0 (mod n)`). – Joran Beasley Feb 26 '14 at 21:51
-
1
-
+1 lol I never thought of that I usually just tack on `+ data[-len(data)%n:]` – Joran Beasley Feb 26 '14 at 21:55
1
>>> s = 'CAGCAGCAT'
>>> [''.join(g) for g in zip(s[::3], s[1::3], s[2::3])]
['CAG', 'CAG', 'CAT']
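To see why this works, it may help to look at the three strided slices on their own. This is just a sketch with made-up variable names (`firsts`, `seconds`, `thirds`), and it uses no `range` at all:

```python
s = 'CAGCAGCAT'

firsts = s[0::3]   # characters at indices 0, 3, 6
seconds = s[1::3]  # characters at indices 1, 4, 7
thirds = s[2::3]   # characters at indices 2, 5, 8

# zip pairs up the i-th character of each slice, rebuilding each triplet.
codons = [a + b + c for a, b, c in zip(firsts, seconds, thirds)]

print(firsts, seconds, thirds)  # CCC AAA GGT
print(codons)                   # ['CAG', 'CAG', 'CAT']
```

Note that `zip` stops at the shortest slice, so an incomplete trailing triplet is silently dropped.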

ndpu
1
You can use a list comprehension; the third parameter of range is the step:
>>> s = "CAGCAGCAT"
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']

pkacprzak
1
You can use the `grouper` recipe from the `itertools` documentation:
>>> s = 'CAGCAGCAT'
>>> list(grouper(s, 3))
[('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
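Note that `grouper` is not importable from `itertools`; it is a recipe you paste into your own code. A simplified sketch for Python 3, modeled on the documented recipe (the version in the current docs may take more options), could look like this:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last with fillvalue."""
    # The same iterator repeated n times, so each tuple takes n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

s = 'CAGCAGCAT'
groups = list(grouper(s, 3))
codons = [''.join(g) for g in groups]  # join each tuple back into a string

print(groups)  # [('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
print(codons)  # ['CAG', 'CAG', 'CAT']
```

If the length is not a multiple of n, pass `fillvalue=''` before joining, since the default `None` padding cannot be joined into a string.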
Or in your case, you can also use simple slices:
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']

poke
1
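def chunker(seq, size):
    # Lazily yield successive slices of length `size`; the last may be shorter.
    # range works on Python 2 and 3 (on Python 2, xrange avoids building a list).
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))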
Stolen from "What is the most 'pythonic' way to iterate over a list in chunks?"
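A quick usage sketch, assuming Python 3 (so `range` in place of `xrange`). Because `chunker` returns a generator, wrap it in `list` to see the pieces:

```python
def chunker(seq, size):
    # Slicing works on any sequence type, so this handles strings and lists alike.
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

print(list(chunker('CAGCAGCAT', 3)))      # ['CAG', 'CAG', 'CAT']
print(list(chunker('CAGCAGCATCA', 3)))    # ['CAG', 'CAG', 'CAT', 'CA']
print(list(chunker([1, 2, 3, 4, 5], 2)))  # [[1, 2], [3, 4], [5]]
```

Unlike the plain `zip` approaches, slicing keeps a shorter final chunk instead of dropping it.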