I am trying to loop through a pandas DataFrame and, every time a specific string appears, set the cell on the same row but two columns earlier to the value of the cell one row above that string. I have attached a visual explanation of what I am trying to do in case that does not make sense.

Code:

for index, row in df.iterrows():
    if row[3] == 'National Account Job Coordinator':
        row[1] = df.iloc[index-1, 3]
    else:
        continue

The code will print out the correct values, but does not set the dataframe values... Any ideas? Thanks

johankent30
  • Can you clarify your need? I know you posted a picture, and I can assume what you need, but the phrases `it will set the value for the cell on the *same* row` and `to the value of the cell *one row before*` seem a little confusing and contradictory – MattR Jan 23 '18 at 21:06
  • use `.loc` or `.iloc` to assign values. `iterrows` will just create copies that will be discarded. – Paul H Jan 23 '18 at 21:07
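
To illustrate the comment above, here is a minimal sketch of the original loop rewritten so that the assignment goes through `df.iloc` rather than through the `row` copy. The two-row frame and the column names `a` to `d` are made up for illustration, and the `pos > 0` guard is an added assumption so that the first row does not wrap around to the last row via index -1.

```python
import pandas as pd

target = 'National Account Job Coordinator'

# Hypothetical stand-in for the real data; column names are made up.
df = pd.DataFrame({
    'a': [None, None],
    'b': [None, None],
    'c': [None, None],
    'd': ['Greensboro', target],
})

# enumerate gives the row position, which is what .iloc expects
for pos, (index, row) in enumerate(df.iterrows()):
    # skip position 0: there is no previous row to copy from
    if pos > 0 and row.iloc[3] == target:
        # write to the DataFrame itself, not to the row copy
        df.iloc[pos, 1] = df.iloc[pos - 1, 3]

print(df)
```

Reading from `row` is fine; only the write needs to target `df`.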

1 Answer

You should use .iloc or .loc to subset your DataFrame and then access that subset's column to modify it. Here's another Stack Overflow answer that gives a good explanation of .iloc and .loc: "pandas iloc vs ix vs loc explanation?"

Here's a simple example taking the information from your spreadsheet and sticking it in a DataFrame.

>>> import pandas as pd
>>> df = pd.DataFrame(data, columns=['level_0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3'])
>>> df
   level_0  Unnamed: 1  Unnamed: 2                        Unnamed: 3
0      NaN         NaN         NaN                        Greensboro
1      NaN         NaN         NaN  National Account Job Coordinator
>>> df['Unnamed: 3']
0                          Greensboro
1    National Account Job Coordinator
Name: Unnamed: 3, dtype: object
>>> df.loc[df['Unnamed: 3'] == 'National Account Job Coordinator', 'Unnamed: 1'] = 'Greensboro'
>>> df
    level_0  Unnamed: 1  Unnamed: 2                        Unnamed: 3
0      NaN         NaN         NaN                        Greensboro
1      NaN  Greensboro         NaN  National Account Job Coordinator
Orenshi
  • Okay, this gets me very close. The issue is that 'Greensboro' may not always be the value in the row above. How can I assign the value from the previous row? – johankent30 Jan 23 '18 at 21:27
  • Figured it out. Thanks! – johankent30 Jan 23 '18 at 21:34
  • Are you programmatically going to set a whole DataFrame column based on the previous row's column? If I take your request quite literally, you can set the value by doing something like `df.shift(1).iloc[-1]['Unnamed: 3']` -- edit -- Oh, you already got it! Never mind :) – Orenshi Jan 23 '18 at 21:36
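
The thread does not show the solution the asker ended up with. One possible way to handle the follow-up question, sketched here under assumptions, is to combine the boolean mask from the answer with `Series.shift(1)` so that each matching row receives whatever sits one row above it in 'Unnamed: 3'. The four-row frame and the 'Raleigh' value are made up for illustration.

```python
import pandas as pd

target = 'National Account Job Coordinator'

# Hypothetical data modeled on the answer's example; 'Raleigh' is made up.
df = pd.DataFrame({
    'level_0': [None, None, None, None],
    'Unnamed: 1': [None, None, None, None],
    'Unnamed: 2': [None, None, None, None],
    'Unnamed: 3': ['Greensboro', target, 'Raleigh', target],
})

# True on every row that holds the target string
mask = df['Unnamed: 3'] == target

# shift(1) moves each value down one row, so position i holds row i-1's value
previous = df['Unnamed: 3'].shift(1)

# assign only on the matching rows; the other rows are left untouched
df.loc[mask, 'Unnamed: 1'] = previous[mask]

print(df)
```

If the target string appears in the very first row, the shifted value there is missing, so that cell ends up as NaN rather than a copied value.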