I'm new to Python. My task is to iterate over all pairs of values in a dictionary and compute the Hamming distance between the two sequences (each value is a sequence, so each pair of values is a pair of sequences). I then need to print the corresponding keys whenever the Hamming distance is 1. The code is as follows.

import numpy as np
import itertools
from scipy.spatial.distance import hamming

graph=[]
i=5
t1=tuple('aaaaa')
t2=tuple('aaaab')
t3=tuple('aaaac')
t4=tuple('aaaad')
t5=tuple('aaaaa')

population={'1':t1, '2':t2, '3':t3, '4':t4, '5':t5}

for pair in itertools.combinations(np.array(population.values()),2):
    if hamming(pair[0],pair[1])*i==1: graph.append(str(population.keys()[population.values().index(pair[0])]) +'\t' + str(population.keys()[population.values().index(pair[1])]) +'n')

print graph 

The error is:

Traceback (most recent call last):
  File "test.py", line 18, in <module>
    if hamming(pair[0],pair[1])*len==1: graph.append(str(population.keys()[population.values().index(pair[0])]) +'\t' + str(population.keys()[population.values().index(pair[1])]) +'n')
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Any comments are greatly appreciated.

Edit: I learned how to look up a key from a value in a dictionary from this link: Get key by value in dictionary

Edit: I renamed the variable `len` to `i` so that it no longer shadows the built-in.

qaz
  • 2
    Hint for the future: it's generally nice to not only give the error you received, but also *where* the error occurred. Otherwise we answerers need to comb through your code before even beginning to understand what the problem is. Good luck! – jme Mar 10 '15 at 17:00
  • 3
    Don't try to do so much on one line. That long line you have there is an unreadable monstrosity. Besides making the code more readable, splitting the line up will help you by making the line reported in the traceback more specific. – interjay Mar 10 '15 at 17:09
  • And if you're using the dictionary to find a key based on the value, then your dictionary is reversed and you should swap the keys and values. – interjay Mar 10 '15 at 17:11
  • Also please don't name your variables like built-in functions (`len`). – mkrieger1 Mar 10 '15 at 17:13
  • @interjay, I understand. I thought about this. But I cannot. Because the keys are supposed to be distinct, while in my case, I need duplicate keys. – qaz Mar 10 '15 at 17:13
  • @qaz Don't worry too much about that for now. Focus on breaking down the code into simpler statements to pinpoint where that error is occurring, and it'll be easier to find out why. – Tim Pierce Mar 10 '15 at 17:15
  • If you don't convert `population.values()` to a `np.array` the code runs without error, but I'm not sure if it does what you want. – mkrieger1 Mar 10 '15 at 17:22
  • What if the values are not unique, as in this case? `t5` has the same value as `t1`. – emvee Mar 10 '15 at 17:27
  • I've found a way around this by swapping the keys and values and using a list of integers instead of a single integer as the value. Still, I would greatly appreciate it if you know how to fix the above code. – qaz Mar 10 '15 at 17:30

1 Answer

The problem has to do with the comparison of dissimilar types and can be demonstrated with a simpler script:

import numpy as np
mytuple = ('a', 'a')
myarray = np.array(mytuple)
[myarray].index(mytuple)

When run, you get:

$ python hem.py

Traceback (most recent call last):
  File "hem.py", line 4, in <module>
    [myarray].index(mytuple)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

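population.values() is a list of tuples, but pair[0] is a NumPy array. list.index compares each element with ==, and comparing a tuple with an array produces an element-wise boolean array rather than a single True or False, so taking its truth value raises this ValueError. The error message itself is confusing, but the upshot is that you'll need a different way to find the index.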
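One possible way around the lookup, sketched here under a few assumptions: iterate over pairs of (key, value) items so that each key travels with its sequence, which removes the need to search the values at all and also copes with duplicate sequences such as t1 and t5. The helper `hamming_count` is a hypothetical pure-Python stand-in for `scipy.spatial.distance.hamming(...) * i`, and `sorted` is used only to make the output order predictable.

```python
import itertools

def hamming_count(a, b):
    """Hypothetical helper: number of positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

population = {
    '1': tuple('aaaaa'),
    '2': tuple('aaaab'),
    '3': tuple('aaaac'),
    '4': tuple('aaaad'),
    '5': tuple('aaaaa'),
}

graph = []
# Pair up (key, sequence) items instead of bare values, so no reverse lookup is needed.
for (key_a, seq_a), (key_b, seq_b) in itertools.combinations(sorted(population.items()), 2):
    if hamming_count(seq_a, seq_b) == 1:
        graph.append(key_a + '\t' + key_b)

print(graph)
```

With the sample data, every pair except '1' and '5' (which are identical) differs in exactly one position, so that is the only pair left out of graph.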

tdelaney