2

My first dataset is in the tsv format

Id  Val1    Val2
0   1   2
1   2   3
2   3   5
3   1   3

Second Dataset in the form

Id  Val3
0   3
2   5

I have to join both datasets on the Column Id to give the output as

Id  Val1    Val2    Val3
0   1   2   3
2   3   5   5

How do I achieve this in pandas?

dineshdileep
  • 715
  • 2
  • 13
  • 24
  • 3
    You can achieve that by reading the docs! Specifically on merging etc: http://pandas.pydata.org/pandas-docs/stable/merging.html – JohnE Jan 05 '17 at 15:48
  • 1
    err.. did you look at the [docs](http://pandas.pydata.org/pandas-docs/stable/merging.html#database-style-dataframe-joining-merging)? specifcally `pd.merge(lhs, rhs, on='Id', how='inner')`? – EdChum Jan 05 '17 at 15:48
  • @JohnE Thankfully, there's a [doc-style dupe](https://stackoverflow.com/questions/53645882/pandas-merging-101) you can use to close questions like these now. – cs95 Dec 08 '18 at 06:11
  • @coldspeed ah, thanks, good to know (although I'm not very active here nowadays but I bookmarked it) – JohnE Dec 08 '18 at 15:50

2 Answers2

3

Have a look at df.merge

You probably want to set df.merge(df2, on='Id', how='inner')

BartDur
  • 1,086
  • 1
  • 12
  • 21
0

In pandas you JOIN using the MERGE function.

With your initial dataframes:

df1 = pd.DataFrame({"Id": [0,1,2,3], "Val1": [1,2,3,1], "Val2": [2,3,5,3]})
df2 = pd.DataFrame({"Id": [0,2], "Val3": [3,5]})

You do the inner join with pd.merge():

df = pd.merge(left=df1, right=df2, how='inner')

df
Id  Val1    Val2    Val3
0   1   2   3
2   3   5   5
queise
  • 2,286
  • 21
  • 28