In the following code, I am attempting to calculate both the frequency and the sum of a set of vectors (NumPy arrays):
def calculate_means_on(the_labels, the_data):
    freq = dict()
    sums = dict()
    means = dict()
    total = 0
    for index, a_label in enumerate(the_labels):
        this_data = the_data[index]
        if a_label not in freq:
            freq[a_label] = 1
            sums[a_label] = this_data
        else:
            freq[a_label] += 1
            sums[a_label] += this_data
Suppose the_data (a NumPy 'matrix') is originally:
[[ 1. 2. 4.]
[ 1. 2. 4.]
[ 2. 1. 1.]
[ 2. 1. 1.]
[ 1. 1. 1.]]
After running the above code, the_data becomes:
[[ 3. 6. 12.]
[ 1. 2. 4.]
[ 7. 4. 4.]
[ 2. 1. 1.]
[ 1. 1. 1.]]
Why is this? I've narrowed it down to the line sums[a_label] += this_data: when I change it to sums[a_label] = sums[a_label] + this_data, it behaves as expected, i.e., the_data is not modified.
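A minimal sketch of what seems to be going on, assuming the_data is an ordinary 2-D float ndarray. Indexing a single row returns a view rather than a copy, so the first assignment stores a view into the_data, and the later += adds in place through that view. The labels and values below are made up for illustration, and sum_by_label and copy_first are hypothetical names, not part of my original code.

```python
import numpy as np

# Hypothetical data and labels, chosen only for illustration.
data = np.array([[1.0, 2.0, 4.0],
                 [1.0, 2.0, 4.0],
                 [2.0, 1.0, 1.0]])
labels = ["a", "a", "b"]


def sum_by_label(the_labels, the_data, copy_first):
    """Sum rows per label; copy_first decides whether the first row is copied."""
    sums = {}
    for index, a_label in enumerate(the_labels):
        this_data = the_data[index]  # basic indexing returns a view, not a copy
        if a_label not in sums:
            # Without the copy, sums[a_label] shares memory with the_data.
            sums[a_label] = this_data.copy() if copy_first else this_data
        else:
            # In-place add: writes into whatever array sums[a_label] refers to.
            sums[a_label] += this_data
    return sums


aliased = data.copy()
aliased_sums = sum_by_label(labels, aliased, copy_first=False)
# Row 0 of `aliased` now holds the running sum for label "a".

safe = data.copy()
safe_sums = sum_by_label(labels, safe, copy_first=True)
# `safe` is left untouched because the sums live in their own arrays.
```

With sums[a_label] = sums[a_label] + this_data, the right-hand side builds a new array and rebinds the dictionary entry to it, which would explain why that form leaves the_data alone. Copying the first row, as in the copy_first=True path, is one way to keep += without touching the input.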