Merge dataframes returned from applying function to DataFrame?

Question

My question is related to this question:

Merge dataframe with another dataframe created from apply function?

Here is my version of code:

import pandas as pd

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

def get_taxes_from_api(state, annual_salary):
    return pd.DataFrame({'State': [state, state], 
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)], 
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)], 
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

How do I apply get_taxes_from_api to each row of df and combine the returned dataframes into one dataframe?

The only difference is that my function returns a multiple-row dataframe, not a 1-row dataframe, so the solution to the question above does not work for my situation. (And I don't have enough reputation to leave a comment there.)

score 1 · Answer 1 · answered Jun 29 '22 at 22:38

This doesn't directly answer your question, but here's one way that doesn't use apply:

col = ['State','Annual Salary']
dat = [['New York', 132826], ['New Hampshire',128704], ['California',127388], ['Vermont',121599], ['Idaho',120011]]
df = pd.DataFrame(dat, columns=col)

#Create the "first" row of each state from your function by adding columns
df['annual.fica.amount'] = df['Annual Salary'].multiply(0.067)
df['annual.federal.amount'] = df['Annual Salary'].multiply(0.3)
df['annual.state.amount'] = df['Annual Salary'].multiply(0.048)

#Create the "second" row of each state as a new df
cumulative_df = df.copy()
cumulative_df['annual.fica.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.federal.amount'] += cumulative_df['Annual Salary']
cumulative_df['annual.state.amount'] += cumulative_df['Annual Salary']

#Concatenate the two tables and sort so the states are right next to each other
final_df = pd.concat((df,cumulative_df)).sort_values('State').reset_index(drop=True)
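Note that the code above keeps the 'Annual Salary' column and leaves the amounts as floats, while the function in the question truncates with int(). A minimal sketch of the same vectorized idea that reproduces the function's columns, assuming the rates from the question; the names rates, first, second and matched_df are made up for illustration, and the stable sort is my own addition to keep the first row of each state ahead of the second:

```python
import pandas as pd

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire', 128704], ['California', 127388],
       ['Vermont', 121599], ['Idaho', 120011]]
df = pd.DataFrame(dat, columns=col)

# Rates taken from get_taxes_from_api in the question (hypothetical container)
rates = {'annual.fica.amount': 0.067,
         'annual.federal.amount': 0.3,
         'annual.state.amount': 0.048}

first = df[['State']].copy()   # "first" row of each state
second = df[['State']].copy()  # "second" row of each state
for name, rate in rates.items():
    # astype(int) truncates toward zero, like int() in the question
    first[name] = (df['Annual Salary'] * rate).astype(int)
    second[name] = (df['Annual Salary'] * (1 + rate)).astype(int)

# mergesort is stable, so within a state the "first" row stays before the "second"
matched_df = (pd.concat([first, second])
              .sort_values('State', kind='mergesort')
              .reset_index(drop=True))
print(matched_df)
```

Unlike the loop-based answers, the states come out in alphabetical order rather than in the order of df.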

Yefet · Answer 2 · 2022-06-29T23:13:30.340

You could use pd.concat on the nested DataFrames returned by apply:

nested_df = df.apply(lambda x: get_taxes_from_api(x["State"],x["Annual Salary"]),axis=1)

result = pd.DataFrame()

for element in nested_df:
    result = pd.concat([result,element])

result:

print(result)

	State	annual.fica.amount	annual.federal.amount	annual.state.amount
0	New York	8899	39847	6375
1	New York	141725	172673	139201
0	New Hampshire	8623	38611	6177
1	New Hampshire	137327	167315	134881
0	California	8534	38216	6114
1	California	135922	165604	133502
0	Vermont	8147	36479	5836
1	Vermont	129746	158078	127435
0	Idaho	8040	36003	5760
1	Idaho	128051	156014	125771
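A variation, as a sketch only: instead of growing result inside the loop, the per-row frames can be collected in a list and passed to pd.concat once. This assumes df and get_taxes_from_api as defined in the question; frames and combined are illustrative names, and ignore_index=True is an optional choice that replaces the repeated 0/1 index with 0..9.

```python
import pandas as pd

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire', 128704], ['California', 127388],
       ['Vermont', 121599], ['Idaho', 120011]]
df = pd.DataFrame(dat, columns=col)

def get_taxes_from_api(state, annual_salary):
    # Stand-in from the question: returns two rows per state
    return pd.DataFrame({'State': [state, state],
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)],
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)],
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

# One DataFrame per input row, collected in a plain list
frames = [get_taxes_from_api(state, salary)
          for state, salary in zip(df['State'], df['Annual Salary'])]

# Single concat call; ignore_index=True renumbers the rows 0..n-1
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Concatenating once avoids copying the accumulated result on every iteration, which tends to matter as the number of rows grows.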

score 0 · Answer 3 · answered Jul 25 '23 at 10:25

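You can add a common join key to both DataFrames and combine them with pd.merge (df_after_apply is not defined here; it presumably stands for the DataFrame built from the function's results):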

df["df_merge_key"] = "#"
df_after_apply["df_merge_key"] = "#"
details_df = pd.merge(df, df_after_apply, how="left", on="df_merge_key").drop(labels=["df_merge_key"], axis=1)

This is simpler and neater in my opinion.
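One caveat: a constant key such as "#" pairs every row of df with every row of the other DataFrame. If the goal is to line the tax rows up with their own state, a possible sketch is to merge on the 'State' column that the function already returns. This assumes each state appears once in df; taxes and by_state_df are hypothetical names.

```python
import pandas as pd

col = ['State', 'Annual Salary']
dat = [['New York', 132826], ['New Hampshire', 128704], ['California', 127388],
       ['Vermont', 121599], ['Idaho', 120011]]
df = pd.DataFrame(dat, columns=col)

def get_taxes_from_api(state, annual_salary):
    # Stand-in from the question: returns two rows per state
    return pd.DataFrame({'State': [state, state],
                         'annual.fica.amount': [int(annual_salary * 0.067),
                                                int(annual_salary * 1.067)],
                         'annual.federal.amount': [int(annual_salary * 0.3),
                                                   int(annual_salary * 1.3)],
                         'annual.state.amount': [int(annual_salary * 0.048),
                                                 int(annual_salary * 1.048)]})

# Combined tax rows for all states (hypothetical name)
taxes = pd.concat(
    [get_taxes_from_api(s, a) for s, a in zip(df['State'], df['Annual Salary'])],
    ignore_index=True)

# Left merge on 'State': each row of df is repeated once per matching tax row
by_state_df = df.merge(taxes, on='State', how='left')
print(by_state_df)
```

The result keeps 'Annual Salary' next to the tax columns, with two rows per state.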
