Apologies if this has been asked before; somehow I have not been able to find the answer.

Let's say I have two lists of values:

rows = [0,1,2]
cols = [0,2,3]

that represent row and column indexes respectively. Taken together, the two lists act as coordinates in the matrix, i.e. (0,0), (1,2), (2,3).

I would like to use those coordinates to change specific cells of the dataframe without using a loop.

In numpy, this is trivial:

import numpy as np
data = np.ones((4,4))
data[rows, cols] = np.nan

array([[nan,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.],
       [ 1.,  1.,  1., nan],
       [ 1.,  1.,  1.,  1.]])

But in pandas, it seems I am stuck with a loop:

import pandas as pd
df = pd.DataFrame(np.ones((4,4)))
for _r, _c in zip(rows, cols):
    df.iat[_r, _c] = np.nan

Is there a way to use two vectors that list coordinate-like indexes to directly modify cells in pandas?


Please note that the answer is not to use iloc instead: that selects the intersection of entire rows and columns.
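To make the difference concrete, here is a minimal sketch using the same `rows` and `cols` as above. The names `block` and `points` are only illustrative. It compares the shape returned by `iloc` with the shape returned by NumPy's pairwise indexing:

```python
import numpy as np
import pandas as pd

rows = [0, 1, 2]
cols = [0, 2, 3]
df = pd.DataFrame(np.ones((4, 4)))

# iloc with two lists crosses every listed row with every listed column
block = df.iloc[rows, cols]
print(block.shape)   # (3, 3): nine cells, not three

# NumPy indexing pairs the two lists element-wise
points = df.to_numpy()[rows, cols]
print(points.shape)  # (3,): one value per (row, col) pair
```

So assigning through `df.iloc[rows, cols]` would overwrite the whole 3x3 block rather than the three cells of interest.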

toto_tico
  • Ummm, it seems like you are already doing it the right way; I am not sure whether there is an alternative solution for this question. – BENY Aug 21 '18 at 13:48

1 Answer

Very simple! Exploit the fact that pandas is built on top of numpy and use DataFrame.values:

df.values[rows, cols] = np.nan

Output:

     0    1    2    3
0  NaN  1.0  1.0  1.0
1  1.0  1.0  NaN  1.0
2  1.0  1.0  1.0  NaN
3  1.0  1.0  1.0  1.0
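One caveat worth hedging: writing through `.values` only reaches the DataFrame when `.values` hands back a view of the underlying data. That is the case for a single-dtype frame like this one, but with mixed column dtypes it returns a copy, so the assignment would be silently lost. As a sketch of an alternative that does not rely on that, one could build a boolean mask with the same NumPy indexing and pass it to `DataFrame.mask`. The names `mask` and `result` are illustrative, and this returns a new frame rather than modifying `df` in place:

```python
import numpy as np
import pandas as pd

rows = [0, 1, 2]
cols = [0, 2, 3]
df = pd.DataFrame(np.ones((4, 4)))

# Boolean array that is True exactly at each (row, col) pair
mask = np.zeros(df.shape, dtype=bool)
mask[rows, cols] = True

# DataFrame.mask replaces cells where the condition is True (with NaN by default)
result = df.mask(mask)
print(result)
```

The loop-free part is still done by NumPy; pandas only applies the resulting mask.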
Yuca
  • this is `numpy` as well, buddy. – BENY Aug 21 '18 at 13:48
  • well, pandas is itself numpy, no? – Yuca Aug 21 '18 at 13:49
  • thanks a lot, I promise I tried this, but somehow I made a mess of the indexes and ended up doing it wrong; now it works! My real case is obviously more convoluted, and I ended up doing something that I now realize is silly, like this: `models.loc[someselector,:].values[range(_idxmin.shape[0]), _idxmin] = np.nan`. It was a matter of rethinking my indexes. – toto_tico Aug 21 '18 at 14:00
  • @Yuca, pandas uses numpy in the background, but in the last couple of days I have been learning (the hard way) that there are important differences when you are trying to write very efficient code, even for things that look [very similar](https://stackoverflow.com/questions/51932428/obtain-min-and-idxmin-or-max-and-idxmax-at-the-same-time-simultaneou/51932429#51932429). – toto_tico Aug 21 '18 at 14:01
  • @toto_tico thank you for sharing your experience, I agree, the devil is in the details! – Yuca Aug 21 '18 at 14:05