I have a pandas df, and would like to calculate all the possible differences between the values of a certain column while retaining the indexes of the rows that generated each difference value.

To my python newbie mind, the most reasonable way to do so appears to be the following:

  • create a function that locates all the values taking part in the computation and compute the differences
  • have the function return three lists: the two indexes taking part in the operation and the result
  • store these three lists in a df, as brilliantly suggested in this other thread. I am using the following code:
ind1 = []
ind2 = []
delta = []
def calculator(): # ind stands for index
    for i in range(len(df)):
        for j in range(len(df)):
            v1 = df.loc[i, 'col']
            v2 = df.loc[j, 'col']
            dv = abs(v1-v2)
            delta.append(dv)
            ind1.append(i)
            ind2.append(j)
    return ind1, ind2, delta

The problem arises when constructing the new df, as I get an unpacking problem:

data = []
for ind1, ind2, delta in calculator():
    data.append([ind1, ind2, delta])
new_df = pd.DataFrame(data, columns=['ind1', 'ind2', 'delta'])

returns:

ValueError: too many values to unpack (expected 3)

Any idea on how to solve this issue, while constructing the df properly as indicated in the other thread?

  • Your `for` doesn't do what you think it does. You think it takes a value out of each list, but instead it's trying to take the first values from the first list! Try `zip(*calculator())`. – Nullman Feb 28 '22 at 11:04
  • Use "zip" to create a list of tuples from the tuple of lists returned by "calculator". – Michael Butscher Feb 28 '22 at 11:06
  • You are trying to iterate over the rows of a list of tuples, but you are actually iterating over a tuple containing three lists. So your `for` just returns the first value of the tuple (which is a list), then your `ind1, ind2, delta in` tries to assign that list to three variables, but it gets more than three values and so it fails. Transpose the list as @Nullman suggests. – ewz93 Feb 28 '22 at 11:09
  • @Nullman's method works like a charm, thank you. I am not sure, though, how the fix works; would you mind explaining a little more? – Antonio Carnevali Feb 28 '22 at 11:32

1 Answer

The `for` loop does not work as you might expect. Consider the following toy example:

for x,y,z in [[1,2,3], [4,5,6], [7,8,9]]:
    print(x,y,z)

You might expect the output to be:

1 4 7
2 5 8
3 6 9

but what you actually get is:

1 2 3
4 5 6
7 8 9

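This happens because the loop iterates over each item in your list, which is itself a list, and tries to unpack it into your 3 variables. That only succeeds when the item has exactly three elements. In your case `calculator()` returns a tuple of three long lists, so the first item has far more than three values and the unpacking fails. To transpose the list (of lists) you can use the built-in `zip` like so: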

for x,y,z in zip(*[[1,2,3], [4,5,6], [7,8,9]]):
    print(x,y,z)

or in your specific case:

for ind1, ind2, delta in zip(*calculator()):
    data.append([ind1, ind2, delta])
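
Putting it all together, here is a minimal end-to-end sketch of both the failure and the fix. The three-row `df` is made-up toy data, and `calculator` is rewritten to take the frame as an argument (the `frame` parameter is my own addition, not part of the question) so the snippet runs on its own:

```python
import pandas as pd

# Hypothetical toy data; 'col' mirrors the column name used in the question.
df = pd.DataFrame({'col': [10, 13, 19]})

def calculator(frame):
    """Return three parallel lists: first index, second index, absolute difference."""
    ind1, ind2, delta = [], [], []
    for i in frame.index:
        for j in frame.index:
            ind1.append(i)
            ind2.append(j)
            delta.append(abs(frame.loc[i, 'col'] - frame.loc[j, 'col']))
    return ind1, ind2, delta

# Iterating over the returned tuple yields each whole list in turn,
# so unpacking a 9-element list into three names raises ValueError.
error_message = None
try:
    for a, b, c in calculator(df):
        pass
except ValueError as err:
    error_message = str(err)

# zip(*...) regroups the three lists into (ind1, ind2, delta) rows.
rows = list(zip(*calculator(df)))
new_df = pd.DataFrame(rows, columns=['ind1', 'ind2', 'delta'])
print(new_df)
```

Since the three lists are already column-shaped, you could probably also skip the loop entirely and pass them to `pd.DataFrame` as a dict of columns, but the `zip` version stays closest to the code in the question.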
Nullman