1

I am trying to update part of a DataFrame based on another column's value.

If column ['a'] is null, fill column ['a'] with the value of column ['b'], as below:

list_position = [[4, 35]]
df.iloc[list_position[0][0]:list_position[0][1] + 1,:]['a'] = df.iloc[list_position[0][0]:list_position[0][1] + 1,:].apply(lambda row: row['a'] * row['b'] if np.isnan(row['a']) else row['b'], axis=1)

It fails with TypeError: an integer is required.

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

Any suggestions for fixing this would be highly appreciated.

Update 1: I tried all three suggested approaches. The first comes from the answer to the duplicate question:

df['Cat1'].fillna(df['Cat2'])

The other two come from the answers on this post:

1. df['a'][df['a'].isnull()] = df['b']
2. df['a'] = df['a'].fillna(df['b'])

All of them give the same error:

Traceback (most recent call last):
  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item
TypeError: an integer is required

It works if I replace the column name with the column number, like this:

df[7] = df[7].fillna(df[8])

I am not sure why. Does anyone have an explanation?

Manvi

2 Answers

0

This should work in your case

df['a'][df['a'].isnull()] = df['b']
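
The same assignment can also be written with a single `.loc` call, which avoids the chained indexing in `df['a'][mask] = ...` and the `SettingWithCopyWarning` it can trigger. A minimal sketch, assuming a small made-up frame with columns `a` and `b` (the real data is not shown in the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0, np.nan],
                   "b": [10.0, 20.0, 30.0, 40.0]})

# Boolean mask of the rows where column a is missing.
mask = df["a"].isnull()

# Select the rows and the target column in one .loc call, then assign.
df.loc[mask, "a"] = df.loc[mask, "b"]

print(df["a"].tolist())
```

Only the masked rows are written, so existing values in `a` are left untouched.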
Poojan
  • I don't think so. Test out `print(np.nan==np.nan)` – pault Jan 29 '19 at 21:10
  • Traceback (most recent call last): File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item TypeError: an integer is required Error with this statement. – Manvi Jan 30 '19 at 13:41
0

I can see the logic that you are trying to use to complete the task, but there is a much easier way to do it.

df['a'] = df['a'].fillna(df['b'])

This fills the null values in column a with the values from column b at the same index. However, on rows where both column a and column b are null, column a stays null.
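
To illustrate the behaviour described above, here is a small sketch with made-up data (the values are assumptions for illustration only). The last row is null in both columns, so it remains null after the fill:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is missing only in a, row 2 is missing in both.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan],
                   "b": [10.0, 20.0, np.nan]})

# fillna aligns on the index, so each missing a takes b from the same row.
df["a"] = df["a"].fillna(df["b"])

print(df)
```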

Edeki Okoh
  • TypeError: an integer is required. Same error with this statement. – Manvi Jan 30 '19 at 13:39
  • Can you post your code in the comment? I think it has something to do with the quotations making the compiler believe that there are strings being passed through but I can't tell without seeing how you are actually using the code above. Tested it on my code earlier and it worked – Edeki Okoh Jan 30 '19 at 15:48
  • temp_df['ESTIMATED_TIME'] = temp_df['ESTIMATED_TIME'].fillna(temp_df['IF_ARRIVED']) – Manvi Jan 30 '19 at 15:52
  • I am using exactly the line you suggested, just replacing the names of the df and the columns – Manvi Jan 30 '19 at 15:53
  • Traceback (most recent call last): File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item TypeError: an integer is required – Manvi Jan 30 '19 at 15:54
  • Can you check the datatype of estimated time and if arrived and post them here? Like this: temp_df.dtypes – Edeki Okoh Jan 30 '19 at 15:58
  • It's empty after 'Like this:'. Did you want to suggest how to see the datatype? – Manvi Jan 30 '19 at 16:03
  • temp_df.dtypes is what it should say. Shows that on my comment, weird – Edeki Okoh Jan 30 '19 at 16:04
  • 0 object 1 object 2 object 3 object 4 object 5 object 6 object 7 object 8 object 9 object 10 object 11 object 12 float64 13 object dtype: object – Manvi Jan 30 '19 at 16:09
  • It is returning object. But I am receiving it as an argument, like temp_df : pd.DataFrame. – Manvi Jan 30 '19 at 16:10
  • Click the link to talk to me in chat. I know the issue but its better to use chat for extended discussions. [continue this discussion in chat](https://chat.stackoverflow.com/rooms/187590/discussion-between-edeki-okoh-and-manvi). – Edeki Okoh Jan 30 '19 at 16:12
  • Strangely, the same statement works if I replace the column name with the column number: df[7] = df[7].fillna(df[8]) – Manvi Jan 30 '19 at 18:06
  • Thanks for all your help. Can you please tell me which pandas version you are using? I suspect it's a difference in version. – Manvi Jan 30 '19 at 20:45
  • I am using '0.23.4' – Edeki Okoh Jan 30 '19 at 21:19
  • OK. I have the same version. What do you get with print(df.columns.values.tolist())? I get column numbers like this [0, 1, 2, 3, 4, 5, 6, 7] instead of names. – Manvi Jan 31 '19 at 14:32
  • That just means you haven't given your column headers any names yet. – Edeki Okoh Jan 31 '19 at 15:34
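
The last two comments point at the likely cause: the frame's columns are labelled with the integers 0, 1, 2, ..., so a lookup by a string name has nothing to match (the exact exception may differ between pandas versions). A minimal sketch with made-up data, assuming the frame was built without header names; the names `a` and `b` are placeholders, not the asker's real column names:

```python
import numpy as np
import pandas as pd

# Hypothetical frame created without column names: labels default to 0, 1, ...
df = pd.DataFrame([[1.0, 10.0], [np.nan, 20.0]])
print(df.columns.tolist())   # integer labels, not strings
print("a" in df.columns)     # the string label does not exist yet

# Give the columns explicit names (placeholder names for illustration).
df.columns = ["a", "b"]

# Name-based access now works.
df["a"] = df["a"].fillna(df["b"])
print(df["a"].tolist())
```

If the data comes from a file, the names can usually be supplied at load time instead, for example through the `names` or `header` arguments of `pd.read_csv`.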