
I have a DataFrame with a set of columns X. I want to update, using .loc (or .iloc), the content of one row for a subset of those columns (call it Y), but the updated row ends up filled with NaN and I'm trying to understand why.

import pandas as pd

a = pd.DataFrame({'id': [1, 2, 10, 12],
                  'val1': ['a', 'b', 'c', 'd'],
                  'val2': ['e', 'f', 'g', 'h']})

# New content for the columns in Y only
my_row = pd.DataFrame({'id': [7],
                       'val1': ['z']})

# Index label(s) of the row to update
index = a[a.id == 2].index

a.loc[index, ['id', 'val1']] = my_row

I also tried:

a.iloc[index, Y_index] = my_row

where Y_index contains the positions of ['id', 'val1'], and my_row is a DataFrame with the new content I want to assign; it contains only the columns in Y.

Neither version raises an error, but in both cases the updated row ends up filled with NaN.

Assigning a single value (an int rather than a DataFrame) works fine. So I think there must be a way to assign each column its corresponding value, but I cannot find how. Does anyone have an idea?

EDIT: It seems to be related to the index. If I change my code to this:

index = a[a.id == 1].index

then the operation succeeds. The only difference I can see is that, in this case, my_row and a.loc[index, ['id','val1']] have exactly the same index. But this doesn't really help me understand why and how this is happening.

pelos
  • Please update your post to include a [minimal reproducible example](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – not_speshal Jul 26 '23 at 13:04
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jul 26 '23 at 13:05
  • Try using .at instead https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.at.html – Nev1111 Jul 26 '23 at 13:11
  • @Nev1111 I'm afraid this isn't what I am looking for: .at accesses a single value, while I'm trying to change multiple columns – pelos Jul 26 '23 at 13:25

1 Answer

Update the index of the new row to match the original index and then use loc:

a.loc[index, my_row.columns] = my_row.set_index(index)

>>> a
   id val1 val2
0   1    a    e
1   7    z    f
2  10    c    g
3  12    d    h
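
As a rough sketch of why the index has to match: when the right-hand side of a .loc assignment is a DataFrame, pandas aligns it to the target on index (and column) labels before writing. Here my_row carries the label 0 while the target row carries the label 1, so nothing lines up. The reindex call below is only meant to illustrate that alignment, not to show what pandas does internally. Converting the new row to a plain array with to_numpy() is an alternative (not part of the approach above) that skips alignment and assigns by position:

```python
import pandas as pd

a = pd.DataFrame({'id': [1, 2, 10, 12],
                  'val1': ['a', 'b', 'c', 'd'],
                  'val2': ['e', 'f', 'g', 'h']})
my_row = pd.DataFrame({'id': [7], 'val1': ['z']})
index = a[a.id == 2].index  # holds the label 1

# Illustration of label alignment: my_row only has the label 0,
# so looking up the label 1 yields a row of missing values.
aligned = my_row.reindex(index)
all_missing = bool(aligned.isna().all().all())

# Alternative: a plain array carries no labels, so the values
# are assigned by position and no alignment takes place.
a.loc[index, my_row.columns] = my_row.to_numpy()
```

With the array on the right-hand side, the order of the columns in my_row.columns is what decides which value lands in which column.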
not_speshal
  • Thanks for the idea, this does work. But I'm still curious: why doesn't it work if they don't have the same index? At first glance it doesn't seem necessary to me ... – pelos Jul 26 '23 at 13:41
  • Note that you could directly do `a.update(my_row.set_index(index))` ;) – mozway Jul 26 '23 at 13:57
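
For reference, the update variant from the last comment could look like the minimal sketch below, reusing the question's example data. DataFrame.update modifies the frame in place, matches on index and column labels, and only overwrites with non-NA values, which is why the new row still needs the target row's index label:

```python
import pandas as pd

a = pd.DataFrame({'id': [1, 2, 10, 12],
                  'val1': ['a', 'b', 'c', 'd'],
                  'val2': ['e', 'f', 'g', 'h']})
my_row = pd.DataFrame({'id': [7], 'val1': ['z']})
index = a[a.id == 2].index  # holds the label 1

# Relabel the new row so it matches the target row, then update in place.
# Columns absent from my_row (here 'val2') are left untouched.
a.update(my_row.set_index(index))
```

Depending on the pandas version, the dtype of an updated column may change (for example from integer to float), so it may be worth checking a.dtypes afterwards.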