I'm supposed to split this sequence into a list of chunks of length n=3.

codons('agucaccgucautc')
# result = ['agu','cac','cgu','cau']
# 'tc' should be ignored because its length is not n=3

I've tried the following solution:

def codons(RNA): 
    """This functions returns a list of codons present in an RNA sequence"""

    # store the length of string
    length = len(RNA)

    #divide the string in n equal parts
    n = 3
    temp = 0
    chars = int(len(RNA)/3)

    #stores the array of string
    change = []

    #check whether a string can be divided into n equal parts
    for i in range(0, length, chars):
        part = [RNA[i:i+3] for i in range(0, length, n)];
        change.append(part);
        return part

        if (length % n != 0):
            continue

But when I run the code above, it still returns 'tc':

codons('agucaccgucautc')
# result = ['agu', 'cac', 'cgu', 'cau', 'tc']

Can anybody tell me what I should do to ignore a trailing part, like 'tc', whose length is not n=3?

Rehearse
  • `[s[i:i+sz] for i in range(0, len(s) - sz + 1, sz)]` drops the last chunk, otherwise see the dupe link. – ggorlen Oct 10 '20 at 05:36

1 Answer

You could use a list comprehension in the following way:

s = 'agucaccgucautc'
n = 3
# keep only the slices that are exactly n characters long
out = [s[i:i+n] for i in range(0, len(s), n) if len(s[i:i+n]) == n]
# out == ['agu', 'cac', 'cgu', 'cau']
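
The same idea can be wrapped back into the `codons` function from the question. This is only a minimal sketch: the `n` parameter with a default of 3 is an assumption added for illustration, and the range bound follows the `len(s) - sz + 1` approach from the comment under the question, so the short tail is never sliced at all.

```python
def codons(rna, n=3):
    """Return the complete n-character chunks of rna, dropping any shorter tail."""
    # Stopping the range at len(rna) - n + 1 guarantees that every slice
    # rna[i:i + n] has exactly n characters, so no length check is needed.
    return [rna[i:i + n] for i in range(0, len(rna) - n + 1, n)]


print(codons('agucaccgucautc'))  # ['agu', 'cac', 'cgu', 'cau']
```

For input shorter than `n` the range is empty, so the function returns an empty list.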
David