3

I have a pandas DataFrame and am trying to modify the value in the name column of the last row.

I tried:

df.loc[-1,'name'] = "something"

This works.

Now I filter a few rows from df with a query and call the result df_query.

The last row in df_query is:

    id  name
21  965 kris

I check the last row by position:

df_query['name'].iloc[-1]

It shows "kris".

Now on df_query I try:

df_query.loc[-1,'name'] = "something"

It adds an extra row instead of replacing kris with something:

       id       name
21  965.0       kris
-1    NaN  something

It also converts id from int to float.

Why does it sometimes work and sometimes not?

Later, after searching, I found this at https://stackoverflow.com/a/49510469:

Just using iloc[-1, 'a'] won't work as -1 is not in the index.

I couldn't understand the reason given above.

The answer says to try:

df_query.loc[df_query.index[-1],'name'] = "something"

and now it works.

Can someone explain what's happening?

Santhosh

2 Answers

5

You can select the last value of name in different ways. If you use DataFrame.loc, use df.index[-1] to get the label of the last row (this assumes the index values are unique):

df.loc[df.index[-1],'name'] = "something"

Or, if you use DataFrame.iloc, get the position of the name column with Index.get_loc:

df.iloc[-1,df.columns.get_loc('name')] = "something"
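As a minimal sketch of the two approaches above, assuming a small made-up frame (the rows and the labels 7 and 21 are illustrative, not from the question's real data), both should change only the last row and leave the row count unchanged:

```python
import pandas as pd

# Hypothetical frame with non-default integer labels, similar to df_query.
df = pd.DataFrame({"id": [101, 965], "name": ["anna", "kris"]}, index=[7, 21])

# Label-based: df.index[-1] is the label of the last row (21 here).
by_label = df.copy()
by_label.loc[by_label.index[-1], "name"] = "something"

# Position-based: -1 is the last row, get_loc gives the column position.
by_position = df.copy()
by_position.iloc[-1, by_position.columns.get_loc("name")] = "something"

print(by_label)
print(by_position)
```

In both cases the lookup of "last row" is done by position, so it does not matter which labels the index happens to contain.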

If you use:

df.loc[-1,'name'] = "something"

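pandas tries to set the row whose index label is -1 if it exists; otherwise it creates a new row with index label -1. The problem is that if the label -1 belongs not to the last row but, for example, to the first row, it replaces the first row rather than the last.

So it is possible to test which case applies: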

# test the last value of the index
if df.index[-1] == -1:
    # the last row has label -1, so the last row is set
    df.loc[-1,'name'] = "something"

# test all values of the index
elif (df.index == -1).any():
    # some other row has label -1, so that row is set
    df.loc[-1,'name'] = "something"
else:
    # no row has label -1, so a new row with label -1 is created
    df.loc[-1,'name'] = "something"
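The three outcomes can be reproduced with a short sketch. The helper set_minus_one and the sample rows are hypothetical and exist only for illustration; the same assignment is applied to frames that differ only in their index labels:

```python
import pandas as pd

def set_minus_one(index):
    """Assign through the label -1 on a small hypothetical frame."""
    frame = pd.DataFrame({"id": [101, 965], "name": ["anna", "kris"]}, index=index)
    frame.loc[-1, "name"] = "something"
    return frame

last_is_minus_one = set_minus_one([21, -1])   # the last row is overwritten
first_is_minus_one = set_minus_one([-1, 21])  # the first row is overwritten
no_minus_one = set_minus_one([7, 21])         # a third row labeled -1 is appended

print(last_is_minus_one)
print(first_is_minus_one)
print(no_minus_one)
```

In the third case the appended row has no id, so pandas fills it with NaN, which is why the id column in the question changed from int to float.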
jezrael
  • `Pandas try set row with index=-1 if exist, else create new row with index -1.` Then why does it sometimes work? How do I know whether a -1 index exists? – Santhosh Feb 08 '21 at 06:26
  • @Santhosh - I think to prevent this problem you should use the first or second solution. You can test for -1, but I think it is more complicated. Another problem with `df.loc[-1,'name'] = "something"` is that it replaces the first row if that row has `index=-1` – jezrael Feb 08 '21 at 06:28
  • I am OK if the first row is the last row (if the dataframe contains only one row), but I want to know why it is creating a new row, and why it does not recognize -1. How do we know that in some dataframes a -1 index exists and in some it does not? – Santhosh Feb 08 '21 at 06:32
  • Also, I check the index `-1` with `df_query['name'].iloc[-1]` and it shows `kris`. Then how can the last index have no `-1`? – Santhosh Feb 08 '21 at 06:37
  • `but Ithink it is more complicated.` Can you explain the complexity? – Santhosh Feb 08 '21 at 06:38
  • @Santhosh - I will try to modify the answer to explain the 3 possible outputs depending on `-1` – jezrael Feb 08 '21 at 06:39
  • One more thing I want to understand is how come `df_query['name'].iloc[-1]` says `kris` – Santhosh Feb 08 '21 at 06:45
  • @Santhosh - hmmm, `iloc` selects by position, not by index values like `loc`. So it does not create a new row. – jezrael Feb 08 '21 at 06:47
  • I am confused about what the index is in the case of loc and in the case of iloc – Santhosh Feb 08 '21 at 16:09
  • 1
    @Santhosh if you use loc it selects by label, so here by -1, or 21 in the sample data. If you use iloc, the labels do not matter, only positions. So for iloc 0 means the first row, which in the sample data is the row with index=21. Similarly, -1 means the last row, whether the labels are 1000, 21, -1, 2010-01-01, or some string. They can be any value, because for iloc they are not important. – jezrael Feb 08 '21 at 16:18
  • 1
    Thank you, that clarified it for me. So basically loc just looks for a literal label (nothing to do with position) and iloc looks by position. The same answer is found in: [pandas.DataFrame.loc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html#pandas.DataFrame.loc) `A single label, e.g. 5 or 'a', (note that 5 is interpreted as a label of the index, and never as an integer position along the index).` – Santhosh Feb 08 '21 at 16:37
1

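You can also use df.tail to pick the last row of the DataFrame and then replace the value of the name column with something. Note that this is chained assignment: it relies on tail(1) returning a view of the original, so pandas may emit a SettingWithCopyWarning, and with Copy-on-Write enabled the original DataFrame is not modified: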

df_query.tail(1)['name'] = 'something'

Example:

In [629]: df = pd.read_clipboard()

In [630]: df
Out[630]: 
     id  name
21  965  kris

In [631]: df.tail(1)['name'] = 'something'

In [632]: df
Out[632]: 
     id       name
21  965  something
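If the chained form does not update the frame in your pandas version, one possible variant (a sketch on made-up sample rows, not part of the original answer) is to use tail only to look up the last label and do the assignment in a single loc call:

```python
import pandas as pd

# Hypothetical frame; the labels 7 and 21 are illustrative.
df = pd.DataFrame({"id": [101, 965], "name": ["anna", "kris"]}, index=[7, 21])

# tail(1).index holds the label of the last row; assigning through
# df.loc in one step writes to df itself, not to a temporary object.
df.loc[df.tail(1).index, "name"] = "something"

print(df)
```

Like df.index[-1], this selects by label, so it assumes the last label is unique in the index.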
Mayank Porwal