1

I am trying to use recursion to compare values of elements between two lists. For an index i, if list1[i] = A and list2[i] = T and vice versa, the values are valid. Also, for an index i, if list1[i] = G and list2[i] = C and vice versa, the values are valid. However, if these two conditions are not met (list1[i] = A and list2[i] = G), then the values are not valid. I created two new lists where valid values are stores and invalid values are replaced with blanks ' '.

This code works fine with a for loop (for i in range (0, len(list1)):) but how would one use recursion?

def check_valid(sequence1, sequence2):

 sequence1 = list(sequence1)

 sequence2 = list(sequence2)

 valid_units1 = []

 valid_units2 = []

    if (sequence1[i] == 'A' and sequence2[i] == 'T') or (sequence1[i] == 'T' and sequence2[i] == 'A'):
        valid_units1.append(sequence1[i])
        valid_units2.append(sequence2[i])
    elif (sequence1[i] == 'G' and sequence2[i] == 'C') or (sequence1[i] == 'C' and sequence2[i] == 'G'):
        valid_units1.append(sequence1[i])
        valid_units2.append(sequence2[i])
    else:
        valid_units1.append(' ')
        valid_units2.append(' ')

    print(valid_units1)
    print(valid_units2)

    valid_units = [valid_units1, valid_units2]
    return valid_units
Mark
  • 90,562
  • 7
  • 108
  • 148
Tintin
  • 13
  • 3
  • 1
    I think you should ask *if* one should use recursions. It doesn't seem to have any benefit here. And this can be done with `zip()` and a simple list comprehension. – Mark Apr 21 '19 at 21:36
  • I'm sorry, I'm new to Python, not sure what a zip() or a list comprehension are. Would you mind explaining what you mean? – Tintin Apr 21 '19 at 21:49

2 Answers2

1

If you are starting with two lists and want to compare them element-wise you can use zip() to simplify your life. It will give you back the elements paired off. So if you are starting with two lists, you can zip them:

list1 = ['A', 'T']
list2 = ['C', 'G']

zipped = list(zip(list1, list2))
# zipped is [('A', 'C'), ('T', 'G')]

#or use a list comprehension:
zipped = [pair for pair in  zip(list1, list2)]

zip() returns an iterator which is why it's wrapped in list() above. If you use it in a loop or other situation calling for a interator, you don't need to do that.

If you want to compare these you can use a dictionary that defines which letter maps to the other, this will allow you write a much simpler test function:

# Define a mapping that describes which elements belong togethre
pairs = {
    'G':'T',
    'T':'G',
    'C':'A',
    'A':'C'
}

list1 = ['A', 'A', 'T', 'C', 'G', 'C', 'T', 'A']
list2 = ['C', 'G', 'G', 'A', 'C', 'A', 'C', 'T']

# make a new list if the pairs line up with the mapping:
legal = [(a, b) for a, b in  zip(list1, list2) if pairs[a] == b ]

# legal pairs: [('A', 'C'), ('T', 'G'), ('C', 'A'), ('C', 'A')]

There's not much reason to do this recursively, but you of course can. Since zip() returns an iterator (pairs below) you can call next() on it to get there next value and then pass the iterator back. It will throw a StopIteration error when it's out of items, so that can be the edge condition of the recursion:

def buildList(pairs, mapping):
    ''' Takes an iterator and mapping, returns a list of legal items defined by mapping '''
    try:
        a_pair = next(pairs)  # get first item from zipped list
    except StopIteration:   # no more items, just return an empty list
        return []

    a, b = a_pair            
    if mapping[a] == b:    
        return [(a, b)] + buildList(pairs, mapping)
    else: 
        return buildList(pairs, mapping)


list1 = ['A', 'A', 'T', 'C', 'G', 'C', 'T', 'A']
list2 = ['C', 'G', 'G', 'A', 'C', 'A', 'C', 'T']
pairs = {'G':'T','T':'G','C':'A','A':'C'}

buildList(zip(list1, list2), pairs) # use zip to make the zipped iterator
Mark
  • 90,562
  • 7
  • 108
  • 148
0

This question is tagged with recursion and here's my take on it -

def head(xs = []):
  return xs[0]

def tail(xs = []):
  return xs[1:]

def check_pair(x = "", y = "", swap = True):
  if x == "A" and y == "C":
    return True
  elif x == "G" and y == "T":
    return True
  else:
    return swap and check_pair(y, x, False)

def check_valid(a = [], b = []):
  if (not a) or (not b):
    return
  elif check_pair(head(a), head(b)):
    yield (head(a), head(b))
    yield from check_valid(tail(a), tail(b))
  else:
    yield from check_valid(tail(a), tail(b))


a = ['A', 'A', 'T', 'C', 'G', 'C', 'T', 'A']
b = ['C', 'G', 'G', 'A', 'C', 'A', 'C', 'T']

print(list(check_valid(a,b)))
# [('A', 'C'), ('T', 'G'), ('C', 'A'), ('C', 'A')]

This is intuitive, but like zip, the tail function creates intermediate values. We can reduce memory requirement by using a simple index, i -

def check_pair(x = "", y = "", swap = True):
  if x == "A" and y == "C":
    return True
  elif x == "G" and y == "T":
    return True
  else:
    return swap and check_pair(y, x, False)

def check_valid(a = [], b = [], i = 0):
  if i >= min(len(a), len(b)):
    return
  elif check_pair(a[i], b[i]):
    yield (a[i], b[i])
    yield from check_valid(a, b, i + 1)
  else:
    yield from check_valid(a, b, i + 1)

a = ['A', 'A', 'T', 'C', 'G', 'C', 'T', 'A']
b = ['C', 'G', 'G', 'A', 'C', 'A', 'C', 'T']

print(list(check_valid(a,b)))
# [('A', 'C'), ('T', 'G'), ('C', 'A'), ('C', 'A')]

You may be interested in this related Q&A

Mulan
  • 129,518
  • 31
  • 228
  • 259