Pandas Dataframe For Loop

Question

I want to replace a column in a pandas dataframe with a portion of another column. What I have so far is:

for index, row in df.iterrows():
  File = df.at[row, 'FileName']
  df.at[row, 'NUMBER'] = File.split(".")[1]

Ideally, this will iterate through the rows of the dataframe and replace the NUMBER column with a portion of the FileName column.

I am getting the error:

ValueError: At based indexing on an integer index can only have integer indexers

and I think it has to do with the misuse of df.at[], but I am not sure how to fix it.

jezrael · Accepted Answer · 2018-08-23T13:58:36.970

3

Don't loop with iterrows because it is slow. It is better to use str.split and select the second element of each list by indexing:

df['NUMBER'] = df['FileName'].str.split(".").str[1]

Or use a list comprehension if you need better performance:

df['NUMBER'] = [x.split(".")[1] for x in df['FileName']]
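The two approaches above can be compared on a small frame. This is a minimal sketch on made-up data (the file names and the helper second_piece are hypothetical, not from the question). One difference worth knowing: when a name contains no ".", the .str accessor yields NaN for that row, while a bare x.split(".")[1] raises IndexError, so the list version needs a guard.

```python
import pandas as pd

# Hypothetical sample data; the question does not show its actual file names.
df = pd.DataFrame({"FileName": ["scan.001.tif", "scan.002.tif", "README"]})

# Vectorized string methods: rows without a second piece become NaN.
via_str = df["FileName"].str.split(".").str[1]

# List comprehension: same values where a second piece exists, but a bare
# x.split(".")[1] would raise IndexError on "README", so guard it.
def second_piece(name):
    parts = name.split(".")
    return parts[1] if len(parts) > 1 else None

via_list = [second_piece(x) for x in df["FileName"]]

df["NUMBER"] = via_str
print(df)
```

If every file name is known to contain a dot, the guard is unnecessary and the one-line forms above are enough.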

edited Aug 23 '18 at 13:58

answered Aug 23 '18 at 13:49


1

Your solution is objectively better than what I was doing. Thanks so much for your help! – Joe S Aug 23 '18 at 14:05

score 1 · Answer 2 · answered Aug 23 '18 at 13:58

In case you are wondering about the error:

Change df.at[row, 'NUMBER'] to df.at[index, 'NUMBER']. The indexer should be index, not row: row is a Series holding that row's values, not a row label.

It should look like this:

for index, row in df.iterrows():
  df.at[index, 'NUMBER'] = row['FileName'].split(".")[1]

for more info

That said, I prefer jezrael's answer as the solution.
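To see why the original loop fails, it may help to look at what iterrows() yields. The following is a small sketch on hypothetical data (the file names are invented): each iteration gives a pair of the row's index label and a Series of that row's values, and df.at expects the label.

```python
import pandas as pd

# Hypothetical data standing in for the question's dataframe.
df = pd.DataFrame({"FileName": ["a.10.csv", "b.20.csv"], "NUMBER": ["", ""]})

for index, row in df.iterrows():
    # index is the row label (0, 1, ...); row is a Series of the row's values.
    assert isinstance(row, pd.Series)
    # Read the value from the row, write back through the label.
    df.at[index, "NUMBER"] = row["FileName"].split(".")[1]

print(df["NUMBER"].tolist())
```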

score 0 · Answer 3 · answered Aug 23 '18 at 13:53

I believe what you are looking for is "split" in combination with "expand=True". Working example:

import pandas as pd
col_1 = ['abc', 'abc', 'bcd', 'bcd']
col_2 = ['james.25', 'jane.23', 'andrew.15', 'jim.22']
data = pd.DataFrame({'NUMBER': col_1, 'FileName': col_2})

data['NUMBER'] = data['FileName'].str.split('.', expand=True)[1]
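For reference, here is a sketch of what expand=True returns before the [1] selection. The sample values from the example above are repeated so the block runs on its own, and the name pieces is just an illustrative variable. The result is a DataFrame with one integer-labeled column per piece.

```python
import pandas as pd

# Same sample values as the example above, repeated so this runs standalone.
data = pd.DataFrame({"FileName": ["james.25", "jane.23", "andrew.15", "jim.22"]})

# expand=True spreads the split pieces into columns labeled 0, 1, ...
pieces = data["FileName"].str.split(".", expand=True)
print(pieces)

# Column 1 holds the part after the first dot.
data["NUMBER"] = pieces[1]
```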
