0
Sample Data:
df1 = pd.DataFrame([
                        {'ID':'0001', 'Term':'Fall', 'Amount':1000},
                        {'ID':'0002', 'Term':'Fall', 'Amount':2000},
                        {'ID':'0001', 'Term':'Spring', 'Amount':50},
                        {'ID':'0002', 'Term':'Spring', 'Amount':50},
                        {'ID':'0001', 'Term':'Summer', 'Amount':200},
                        {'ID':'0002', 'Term':'Summer', 'Amount':200},
                    ])
Output:

    ID      Term    Amount
0   0001    Fall    1000
1   0002    Fall    2000
2   0001    Spring  50
3   0002    Spring  50
4   0001    Summer  200
5   0002    Summer  200

I want to update the Amount column for every row where the Term is 'Spring'. The Amount column should be set to the Amount value that corresponds to the same ID but for the 'Fall' term.

Any guidance would be appreciated.

Desired output:

    ID      Term    Amount
0   0001    Fall    1000
1   0002    Fall    2000
2   0001    Spring  1000
3   0002    Spring  2000
4   0001    Summer  200
5   0002    Summer  200

3 Answers3

1

Here is one method, where we pivot each term into a column, update just the Spring column, then melt back to the original structure.

df1 = df1.pivot_table("Amount", "ID", "Term").reset_index()
df1["Spring"] = df1["Fall"]
df1 = df1.melt("ID", value_name="Amount")

Here's another method that should work even if there are other columns:

df_spring = df1.loc[df1["Term"] == "Spring", ["ID", "Term"]]
df_fall = df1.loc[df1["Term"] == "Fall", ["ID", "Amount"]]

df_spring = df_spring.reset_index().merge(df_fall).set_index("index")
df1.update(df_spring)
C. Braun
  • 5,061
  • 19
  • 47
0

You can do it as follows:

import pandas as pd

df1 = pd.DataFrame([
                        {'ID':'0001', 'Term':'Fall', 'Amount':1000},
                        {'ID':'0002', 'Term':'Fall', 'Amount':2000},
                        {'ID':'0001', 'Term':'Spring', 'Amount':50},
                        {'ID':'0002', 'Term':'Spring', 'Amount':50},

                    ])

number_rows = list(range(len(df1.index)))
print(number_rows)

for n in number_rows:
    if df1.at[n, "Term"] == "Spring":
        for i in number_rows:
            if df1.at[i, "ID"] == df1.at[n, "ID"]:
                df1.at[n, "Amount"] = df1.at[i, "Amount"]

print(df1)

Output:

     ID    Term  Amount
0  0001    Fall    1000
1  0002    Fall    2000
2  0001  Spring    1000
3  0002  Spring    2000
Dennis
  • 79
  • 1
  • 8
  • 1
    this approach is an anti-pattern and should be avoided. See the second answer [here](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) for more. – C. Braun Sep 07 '20 at 18:45
0

you can use map the column ID where spring, once selected the rows with Fall and set_index the ID.

mSpring = df1['Term'].eq('Spring')
df1.loc[mSpring, 'Amount'] = (df1.loc[mSpring, 'ID']
                                 .map(df1.loc[df1['Term'].eq('Fall')]
                                         .set_index('ID')['Amount'])
                             )
print(df1)
     ID    Term  Amount
0  0001    Fall    1000
1  0002    Fall    2000
2  0001  Spring    1000
3  0002  Spring    2000
4  0001  Summer     200
5  0002  Summer     200
Ben.T
  • 29,160
  • 6
  • 32
  • 54