Suppose I have two DataFrames:
df1: col1, col2, col3
df2: col1, col2, col4

I would like to join the two DataFrames on col1 and col2 without defining a new alias for either table.

I don't want to write

df = df1.join(df2, (df1.col1 == df2.col1) & (df1.col2 == df2.col2))

because it feels clumsy. I would also like to remove the duplicated join columns after the join.

So the final DataFrame should contain only col1, col2, col3 and col4.

How can I achieve that?

mytabi

1 Answer

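For Spark DataFrames, pass the join columns as a list of names, as below. The join is an inner join by default, and the result keeps a single copy of each join column.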

df3 = df1.join(df2, ['col1', 'col2'])
df3.show()
Prince Francis