I have a 34-mer string like

ATGGGGTTTCCC...CTG

I want to get all possible 6-mer substrings of this string. Can you suggest a good way to do this?

Bhargav Rao
Ssank
  • Quite close http://stackoverflow.com/questions/21303224/iterate-over-all-pairs-of-consecutive-items-from-a-given-list though not an exact dupe – Bhargav Rao Jun 10 '15 at 18:12

1 Answer

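Assuming the 6-mers have to be contiguous, you can use slicing in a list comprehension. A string of length n contains n - 6 + 1 such windows, which is why the range stops at len(s)-5: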

>>> s = 'AGTAATGGCGATTGAGGGTCCACTGTCCTGGTAC'
>>> [s[i:i+6] for i in range(len(s)-5)]
['AGTAAT', 'GTAATG', 'TAATGG', 'AATGGC', 'ATGGCG', 'TGGCGA', 'GGCGAT', 'GCGATT', 'CGATTG', 'GATTGA', 'ATTGAG', 'TTGAGG', 'TGAGGG', 'GAGGGT', 'AGGGTC', 'GGGTCC', 'GGTCCA', 'GTCCAC', 'TCCACT', 'CCACTG', 'CACTGT', 'ACTGTC', 'CTGTCC', 'TGTCCT', 'GTCCTG', 'TCCTGG', 'CCTGGT', 'CTGGTA', 'TGGTAC']
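
If the window size may vary, the same slicing idea can be wrapped in a small helper. This is only a sketch: the function name `kmers` and its parameter `k` are illustrative and not part of the answer above.

```python
def kmers(seq, k):
    """Return every contiguous length-k substring of seq, in order."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence of length n has n - k + 1 windows.
    # If k > n, range() is empty and the result is an empty list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


s = 'AGTAATGGCGATTGAGGGTCCACTGTCCTGGTAC'
six_mers = kmers(s, 6)
print(len(six_mers))
```

Overlapping windows are kept on purpose, so repeated 6-mers appear once per position; wrap the result in `set(...)` if only distinct 6-mers are needed.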
Cory Kramer