
I want to compare two DNA sequences and return the positions of the identical nucleotides as a list of pairs (position in sequence 1, position in sequence 2).

input:

a = ["G", "T", "T", "U", "I", "P"]
b = ["E", "G", "T", "P"]

output:

[[0,1], [1,2], [2,2], [5,3]]
Jon Clements
sir_Ouss
  • Are you after *all* pairs? So if you had `a=['T', 'T', 'T']; b = ['T', 'T', 'T']` you'd have 9 results? – Jon Clements Nov 11 '18 at 23:49
  • Did you write any code for this? You need to share the code and explain what exact issue you are facing in that – Chetan Nov 11 '18 at 23:50

2 Answers


You can do it with for loops:

a_s = ["G", "T", "T", "U", "I", "P"]
b_s = ["E", "G", "T", "P"]

d = []
for i, a in enumerate(a_s):
    for j, b in enumerate(b_s):
        if a == b:
            d.append([i, j])
print(d)

Out:

[[0, 1], [1, 2], [2, 2], [5, 3]]

Or you can do it in a single line with a list comprehension:

a_s = ["G", "T", "T", "U", "I", "P"]
b_s = ["E", "G", "T", "P"]    

print([[x, y] for x, av in enumerate(a_s) for y, bv in enumerate(b_s) if av == bv])

This gives the same output.

Note: The first version is in most cases more readable; the second is more concise. You can choose either one depending on the code context and its purpose.
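As the comment under the question points out, both versions return every matching pair, so repeated nucleotides produce one pair per combination. A minimal sketch to show this; the function name `matching_pairs` is only an illustrative helper and not part of the code above:

```python
def matching_pairs(seq_a, seq_b):
    """Return [i, j] for every position pair where seq_a[i] == seq_b[j]."""
    return [[i, j]
            for i, x in enumerate(seq_a)
            for j, y in enumerate(seq_b)
            if x == y]


print(matching_pairs(list("GTTUIP"), list("EGTP")))
# [[0, 1], [1, 2], [2, 2], [5, 3]]

# Three identical nucleotides on each side give 3 * 3 = 9 pairs.
print(len(matching_pairs(list("TTT"), list("TTT"))))
# 9
```

If only one match per position is wanted, the condition would need to change; the question does not say which behaviour is intended.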

Geeocode

Here are two examples using `product` from the `itertools` module.

The first is a traditional for loop that appends to a list.

The second is the equivalent list comprehension.

from itertools import product

a = list('GTTUIP')
b = list('EGTP')

# Without a comprehension.
results = []
for (x, a_s), (y, b_s) in product(enumerate(a), enumerate(b)):
    if a_s == b_s:
        results.append([x, y])
print(results)
 
# With a comprehension
results = [[x, y]
          for (x, a_s), (y, b_s) 
          in product(enumerate(a), enumerate(b)) 
          if a_s == b_s]
print(results)

OUT:

[[0, 1], [1, 2], [2, 2], [5, 3]]

[[0, 1], [1, 2], [2, 2], [5, 3]]
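Both `product` and nested loops compare every element of one sequence with every element of the other. For long sequences, a different technique may help: index the positions of each nucleotide in the second sequence first, then look them up. This is a sketch of that alternative, not something taken from the answers here; `matching_pairs_indexed` is a hypothetical name.

```python
from collections import defaultdict


def matching_pairs_indexed(seq_a, seq_b):
    """Return [i, j] pairs where seq_a[i] == seq_b[j], using a position index."""
    # Map each value in seq_b to the list of positions where it occurs.
    positions = defaultdict(list)
    for j, value in enumerate(seq_b):
        positions[value].append(j)

    # For each element of seq_a, emit one pair per matching position in seq_b.
    # .get() avoids inserting empty lists for values that are absent from seq_b.
    return [[i, j]
            for i, value in enumerate(seq_a)
            for j in positions.get(value, ())]


print(matching_pairs_indexed(list("GTTUIP"), list("EGTP")))
# [[0, 1], [1, 2], [2, 2], [5, 3]]
```

The pairs come out in the same order as with the nested loops, because positions are stored in ascending order. The amount of output is unchanged, so inputs with many repeats still yield many pairs.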

dmmfll
  • For what it's worth, using `itertools.product` is roughly 10% faster. See this gist using the `timeit` module. Each comprehension is run 1 million times: https://gist.github.com/bb4f02e391c7e15df947df6918e7bd93 – dmmfll Nov 12 '18 at 10:56
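To check the timing claim on your own machine, a comparison along these lines could be used. This is a rough sketch and not the code from the linked gist; the function names `nested_pairs` and `product_pairs` and the run count are assumptions, and results will vary with Python version and hardware.

```python
import timeit
from itertools import product

a = list("GTTUIP")
b = list("EGTP")


def nested_pairs():
    # Comprehension with two nested for clauses.
    return [[x, y]
            for x, av in enumerate(a)
            for y, bv in enumerate(b)
            if av == bv]


def product_pairs():
    # Same comparison, iterating over itertools.product instead.
    return [[x, y]
            for (x, av), (y, bv) in product(enumerate(a), enumerate(b))
            if av == bv]


# Both approaches must agree before their timings are worth comparing.
assert nested_pairs() == product_pairs()

runs = 10_000  # kept small here; raise it for more stable numbers
nested_time = timeit.timeit(nested_pairs, number=runs)
product_time = timeit.timeit(product_pairs, number=runs)

print(f"nested loops:      {nested_time:.4f} s for {runs} runs")
print(f"itertools.product: {product_time:.4f} s for {runs} runs")
```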