
I have a pandas DataFrame like this:

         up_value        valsup
0     59044.21272   59044.21272
1     59040.68568   59158.53136
2     59044.21272   59279.91816
3     59040.69570   59394.23280
4     59044.22274   59515.63370
...           ...           ...
6081  58917.07896  774036.35472
6082  58917.07896  774153.95368
6083  58917.08898  774271.68432
6084  58917.07896  774389.15160
6085  58917.08898  774506.88228

                 

I'm trying to use np.argwhere to create a new DataFrame column like this:

df["idx_up"] = np.argwhere(df["valsup"].values > df["up_value"].values)

But it raises the following error:

ValueError: Length of values (6085) does not match length of index (6086)

When I run print(np.argwhere(df["valsup"].values > df["up_value"].values)), the output looks like this:

[[   1]
 [   2]
 [   3]
 ...
 [6083]
 [6084]
 [6085]]

So it seems like np.argwhere only returns 6085 values instead of 6086.
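
A small reproduction with made-up numbers (not the real data above) shows the same behavior: np.argwhere returns one row per True element of the comparison, so its length equals the number of matches rather than the number of DataFrame rows. The name hits is only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up numbers; only the first row fails the comparison.
df = pd.DataFrame({
    "up_value": [10.0, 10.0, 10.0, 10.0],
    "valsup": [10.0, 11.0, 12.0, 13.0],
})

# argwhere lists the positions where the condition is True.
hits = np.argwhere(df["valsup"].values > df["up_value"].values)
print(hits.shape)  # (3, 1): one row per True, not one per DataFrame row
print(len(df))     # 4
```

Assigning hits directly to a column of this 4-row frame would fail for the same reason as in the 6086-row case.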

I want to assign the output to a DataFrame column. Can someone tell me how to fix the error?

Thanks

John
  • What value do you want to assign? **True** and **False**? – Lazyer Jun 18 '22 at 14:22
  • @Lazyer This is what I wanna do. `idx_up = np.argwhere(df["valsup"].values > df["up_value"].values)` and then `idx_up = idx_up[0][0] if len(idx_up) else -1`. I need to assign the latter part to the dataframe – John Jun 18 '22 at 14:24
  • So the **idx_up** column should hold **index if valsup>up_value else -1**? – Lazyer Jun 18 '22 at 14:32
  • @Lazyer I'm trying to replicate this answer. https://stackoverflow.com/a/72337982/18201044 – John Jun 18 '22 at 14:35
  • @Lazyer Brilliant. That's what I wanted. Thanks for the help. – John Jun 18 '22 at 14:49

1 Answer

In the code from one of the answers at that URL,

idx_up = idx_up[0][0] if len(idx_up) else -1

this line keeps only the first match (the element of idx_up at index 0), or -1 if there are no matches.

To fill a whole column instead, first add the column with a default of -1:

df['idx_up'] = -1

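and then update the matching rows from the np.argwhere result (before it is collapsed to a single value). Assigning through df.iloc directly avoids chained assignment, which may not write back to df:

rows = idx_up.ravel()  # flatten the (k, 1) argwhere result to 1-D positions
df.iloc[rows, df.columns.get_loc('idx_up')] = rows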
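
As an alternative to the two-step update, a vectorized sketch using np.where could build the same column in one assignment. The values below are made up for illustration and are not the data from the question; the name mask is likewise an assumption.

```python
import numpy as np
import pandas as pd

# Toy data, not the real frame from the question.
df = pd.DataFrame({
    "up_value": [5.0, 5.0, 5.0, 5.0],
    "valsup": [5.0, 6.0, 4.0, 7.0],
})

# Boolean array with one entry per row, so it always matches len(df).
mask = df["valsup"].to_numpy() > df["up_value"].to_numpy()

# Row position where the condition holds, -1 elsewhere.
df["idx_up"] = np.where(mask, np.arange(len(df)), -1)
print(df["idx_up"].tolist())  # [-1, 1, -1, 3]
```

Because np.where returns one value per row, the length mismatch from np.argwhere does not arise. This stores row positions, which coincide with the index labels only for a default RangeIndex.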
Lazyer