Is there a way to add a new column to a pandas DataFrame based on equivalence of values between two DataFrames?

Question

I have two DataFrames, one called df, and another called df_pag. df has the following columns:

Projetos	Ano/Volume	Unidades

On the other hand, df_pag has the following columns:

Projetos	Ano	Unidades	Paginação

These DataFrames originate from different data mining processes. I want to add a new column to df called 'Paginação', whose row value is pulled from df_pag if, and only if, df['Projetos'] == df_pag['Projetos'], df['Ano/Volume'] == df_pag['Ano'] and df['Unidades'] == df_pag['Unidades'].

Here is what I did:

for i in range(len(df.index)):
    for j in range(len(df_pag.index)):
        if df['Projetos'][i] == df_pag['Projetos'][j] and df['Ano/Volume'][i] == df_pag['Ano'][j] and df['Unidades'][i] == df_pag['Unidades'][j]:
            # .loc avoids chained assignment, which may write to a temporary copy
            df.loc[i, 'Paginação'] = df_pag['Paginação'][j]

PS. This is my first question on StackOverflow, therefore, if there is anything unclear please let me know.

Thanks for the help! It does make my code simpler and quicker. But it is still returning a df without rows... - Lucas Corbanez, Mar 16 '21 at 18:23

Abhi_J · Accepted Answer · 2021-03-16T18:31:00.863

Hi, what about this approach:

# Row-by-row comparison; assumes df and df_pag have the same length and index
result_df = (df['Projetos'] == df_pag['Projetos']) & (df['Ano/Volume'] == df_pag['Ano']) & (df['Unidades'] == df_pag['Unidades'])
df['Paginação'] = df_pag['Paginação'][result_df]

This places NaN values in the Paginação column wherever the condition is not satisfied.

If you want another value in place of NaN, use .fillna(), for example:

# Row-by-row comparison; assumes df and df_pag have the same length and index
result_df = (df['Projetos'] == df_pag['Projetos']) & (df['Ano/Volume'] == df_pag['Ano']) & (df['Unidades'] == df_pag['Unidades'])
df['Paginação'] = df_pag['Paginação'][result_df]
df['Paginação'] = df['Paginação'].fillna('my_value')
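The comparison above works row by row, so it only helps when both DataFrames have the same length and row order. If the rows of df_pag can appear in any order, a left merge on the three key columns may be a better fit. The following is only a sketch: the sample data is hypothetical, and it assumes each key combination appears at most once in df_pag and that the key columns have the same dtype in both frames.

```python
import pandas as pd

# Hypothetical sample data standing in for the two mined DataFrames.
df = pd.DataFrame({
    "Projetos": ["A", "A", "B"],
    "Ano/Volume": [2019, 2020, 2020],
    "Unidades": ["U1", "U2", "U1"],
})
df_pag = pd.DataFrame({
    "Projetos": ["B", "A"],
    "Ano": [2020, 2019],
    "Unidades": ["U1", "U1"],
    "Paginação": ["10-20", "1-9"],
})

# Give the year column the same name in both frames so all keys line up.
keys = ["Projetos", "Ano/Volume", "Unidades"]
df_pag_renamed = df_pag.rename(columns={"Ano": "Ano/Volume"})

# A left merge keeps every row of df; rows without a match get NaN
# in the new 'Paginação' column.
merged = df.merge(df_pag_renamed, on=keys, how="left")

print(merged)
```

If the merge finds no matches at all, it may be worth checking that the key values are formatted identically in both frames (for example, no stray whitespace in the strings). Duplicate key combinations in df_pag would also produce extra rows in the result.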
