2

Out of curiosity, is there a way in Python for a program to count in thirds without using range, using slices and indices instead? For example, what if you had a string of codons like 'CAGCAGCAT'? Could Python divide that string into thirds like this: CAG CAG CAT? I tried, but I failed. If there's a way, please show me how.

  • http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python is related. – rlms Feb 26 '14 at 21:47
  • It's funny how he specifically said "no range" and 4 out of 6 below use range. :) – Russia Must Remove Putin Feb 26 '14 at 21:51
  • Oh funny, I didn’t even read that. Wonder where that requirement comes from… – poke Feb 26 '14 at 21:54
  • Even though you didn't ask for it, you may be interested in [Biopython](http://biopython.org/DIST/docs/tutorial/Tutorial.html): Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. – wolfrevo Feb 26 '14 at 21:57

6 Answers

3
import textwrap
textwrap.wrap('CAGCAGCAT' ,3)

returns

['CAG', 'CAG', 'CAT']
Russia Must Remove Putin
2

You could use the grouper recipe, zip(*[iterator]*n), to collect items without using range.

In [96]: data = 'CAGCAGCAT'

In [97]: [''.join(grp) for grp in zip(*[iter(data)]*3)]
Out[97]: ['CAG', 'CAG', 'CAT']

If len(data) is not a multiple of 3, then the above chops off the remainder. To prevent that, use itertools.izip_longest (renamed itertools.zip_longest in Python 3):

In [102]: import itertools as IT
In [108]: [''.join(grp) for grp in IT.izip_longest(*[iter('CAGCAGCATCA')]*3, fillvalue='')]
Out[108]: ['CAG', 'CAG', 'CAT', 'CA']
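
The session above is Python 2. As a minimal sketch of the same idea on Python 3, assuming only the standard library, the call becomes itertools.zip_longest; the variable names here are illustrative:

```python
from itertools import zip_longest

data = 'CAGCAGCATCA'  # 11 characters, so the last group is incomplete

# The same iterator object is repeated 3 times, so each output tuple
# consumes 3 consecutive characters; fillvalue='' pads the final group.
chunks = [''.join(grp) for grp in zip_longest(*[iter(data)] * 3, fillvalue='')]
print(chunks)  # ['CAG', 'CAG', 'CAT', 'CA']
```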

By the way, the grouper recipe works with any iterable, whereas textwrap.wrap works only with strings. The grouper recipe is also faster:

In [100]: %timeit textwrap.wrap(data, 3)
10000 loops, best of 3: 17.7 µs per loop

In [101]: %timeit [''.join(grp) for grp in zip(*[iter(data)]*3)]
100000 loops, best of 3: 1.78 µs per loop

Also note that textwrap.wrap breaks lines at whitespace, so it may not split your string into groups of 3 characters if the string contains spaces:

In [42]: textwrap.wrap('I am a hat', 3)
Out[42]: ['I', 'am', 'a', 'hat']
unutbu
1
>>> s = 'CAGCAGCAT'
>>> [''.join(g) for g in zip(s[::3], s[1::3], s[2::3])]
['CAG', 'CAG', 'CAT']
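
To see why the stride slices work, here is a small sketch that prints the three intermediate slices. It uses a longer, hypothetical input whose length is not a multiple of 3, which also shows that zip silently drops the incomplete final group:

```python
s = 'CAGCAGCATCA'  # 11 characters, so the last codon is incomplete

first = s[0::3]   # characters at indices 0, 3, 6, 9
second = s[1::3]  # characters at indices 1, 4, 7, 10
third = s[2::3]   # characters at indices 2, 5, 8

# zip stops at the shortest input, so the trailing 'CA' is discarded.
codons = [''.join(g) for g in zip(first, second, third)]
print(first, second, third)  # CCCC AAAA GGT
print(codons)                # ['CAG', 'CAG', 'CAT']
```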
ndpu
1

You can use a list comprehension; the third parameter of range is the step:

>>> s = "CAGCAGCAT"
>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']
pkacprzak
1

You can use the grouper itertools recipe:

>>> s = 'CAGCAGCAT'
>>> list(grouper(s, 3))
[('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
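
Note that grouper is not a built-in; it has to be defined first. A minimal sketch, adapted from the recipe in the itertools documentation and written for Python 3 (on Python 2 the function is izip_longest), might look like this:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # n references to the SAME iterator, so each tuple takes n consecutive items.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

s = 'CAGCAGCAT'
groups = list(grouper(s, 3))
print(groups)  # [('C', 'A', 'G'), ('C', 'A', 'G'), ('C', 'A', 'T')]
```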

Or in your case, you can also use simple slices:

>>> [s[i:i+3] for i in range(0, len(s), 3)]
['CAG', 'CAG', 'CAT']
poke
1
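def chunker(seq, size):
    # Lazily yield successive slices of length `size` (the last may be shorter).
    # Python 3: range is already lazy; on Python 2 use xrange instead.
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))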

Stolen from the question "What is the most 'pythonic' way to iterate over a list in chunks?"
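
As a usage sketch, assuming Python 3 (so range replaces xrange), the generator can be consumed with list(). The input string below is illustrative and deliberately not a multiple of 3 in length, to show that the remainder is kept:

```python
def chunker(seq, size):
    # Generator expression: slices are produced one at a time, on demand.
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

chunks = list(chunker('CAGCAGCATCA', 3))
print(chunks)  # ['CAG', 'CAG', 'CAT', 'CA']
```

Because it relies only on len() and slicing, the same function should work on lists and tuples as well as strings.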

Chris B.