I have a dataframe (df1) with 5 columns (a, b, c, d, e) and 6 rows, and another dataframe (df2) with 2 columns (a, z) and 20000 rows.

How do I map and merge these dataframes on the value of 'a'? Each row of df1 should be matched to the row of df2 with the same 'a' value, returning a new dataframe with 6 columns (the 5 from df1 plus the mapped 'z' column from df2) and 6 rows.

DirtyBit
  • 1
    Welcome to SO. Have a look at https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – meW Jan 08 '19 at 07:03
  • Can you confirm that 'a' is a unique identifier? – nick Jan 08 '19 at 07:03
  • Go to https://stackoverflow.com/questions/53645882/pandas-merging-101 and search for "Merging only a single column from one of the DataFrames". – cs95 Jan 08 '19 at 07:05
  • Yes, 'a' is present in both df1 and df2. We have to map values using 'a'. –  Jan 08 '19 at 07:14

1 Answer

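By using pd.concat, which stacks the two frames row-wise and fills any column missing from one frame with NaN (it does not match rows on 'a'):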

import pandas as pd

# Small toy frames standing in for df1 and df2
columns_df1 = ['a', 'b', 'c', 'd']
columns_df2 = ['a', 'z']
data_df1 = [['abc', 'def', 'ghi', 'xyz'],
            ['abc2', 'def2', 'ghi2', 'xyz2'],
            ['abc3', 'def3', 'ghi3', 'xyz3'],
            ['abc4', 'def4', 'ghi4', 'xyz4']]
data_df2 = [['a', 'z'], ['a2', 'z2']]

df_1 = pd.DataFrame(data_df1, columns=columns_df1)
df_2 = pd.DataFrame(data_df2, columns=columns_df2)
print(df_1)
print(df_2)

# Stack df_2 below df_1; the columns are unioned and gaps become NaN
frames = [df_1, df_2]

print(pd.concat(frames))

OUTPUT: [image: out]
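To make the shape of the concatenated result explicit, here is a minimal sketch using a shortened version of the toy frames (two rows each, values chosen for illustration). It only checks what pd.concat returns: the rows of both frames stacked, the union of their columns, and NaN wherever a frame lacks a column.

```python
import pandas as pd

# Shortened, illustrative versions of the toy frames
df_1 = pd.DataFrame(
    [["abc", "def", "ghi", "xyz"], ["abc2", "def2", "ghi2", "xyz2"]],
    columns=["a", "b", "c", "d"],
)
df_2 = pd.DataFrame([["a", "z"], ["a2", "z2"]], columns=["a", "z"])

# Row-wise concatenation: rows are stacked, columns are unioned
stacked = pd.concat([df_1, df_2])

print(stacked.shape)              # number of rows and columns
print(list(stacked.columns))      # union of both column sets
print(stacked["z"].isna().sum())  # rows from df_1 have no 'z' value
```

Rows are never matched on 'a' here; each input row simply appears once in the result.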

Edit:

To replace NaN values you could use pandas.DataFrame.fillna:

print(pd.concat(frames).fillna("NULL"))

Replace "NULL" with anything you want, e.g. 0.

OUTPUT:

[image: out-2]
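Since the question asks for rows matched on 'a' (5 columns from df1 plus 'z' from df2, same number of rows as df1), a left merge may be closer to that goal than concatenation. The following is only a sketch under assumptions: the frame contents and the name `lookup` are made up for illustration, and duplicates of 'a' in df2 are resolved by keeping the first occurrence, which may not be the right rule for real data.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df1 (a..e) and df2 (a, z)
df1 = pd.DataFrame({
    "a": ["k1", "k2", "k3"],
    "b": [1, 2, 3],
    "c": [4, 5, 6],
    "d": [7, 8, 9],
    "e": [10, 11, 12],
})
df2 = pd.DataFrame({
    "a": ["k3", "k1", "k9", "k1"],
    "z": ["z3", "z1", "z9", "z1-dup"],
})

# Keep one row per key so the left merge cannot multiply df1's rows
# (assumption: the first occurrence of each 'a' is the one to keep)
lookup = df2.drop_duplicates(subset="a", keep="first")

# Left merge: every df1 row is kept; 'z' is NaN where 'a' has no match
merged = df1.merge(lookup, on="a", how="left")
print(merged)

# The same lookup expressed with Series.map on an 'a'-indexed Series
mapped = df1["a"].map(lookup.set_index("a")["z"])
print(mapped)
```

If 'a' is already unique in df2, the drop_duplicates step is a no-op; unmatched keys can then be filled with fillna as shown above.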

DirtyBit
  • What is your IDE? Is it PyCharm or something else? – jezrael Jan 08 '19 at 07:52
  • 1
    @jezrael yes it is, just login with Google/Github and boom! > https://pyfiddle.io/fiddle/5413dc5a-3d3b-4347-b88e-f8279c31c8e4/?m=Saved%20fiddle – DirtyBit Jan 08 '19 at 08:00