Can somebody explain data frame joins with pandas to me based on this example?

The first dataframe, let's call it A, looks like this:

[image: dataframe A]

The second dataframe, B, looks like this:

[image: dataframe B]

I now want to create a plot that compares the values in column running in A with those in B, but only for rows where the string in column graph is the same. (In this example, the first rows of A and B have the same graph, so I want to compare their running values.)

I believe this is what pandas.DataFrame.join is for, but I cannot work out the code needed to join data frames A and B correctly.

clstaudt

1 Answer

I think I would use merge here:

>>> import pandas as pd
>>> a = pd.DataFrame({"graph": ["as-22july06", "belgium", "cage15"], "running": [2, 879, 4292], "mod": [0.28, 0.94, 0.66], "eps": [220, 176, 1096]})
>>> b = pd.DataFrame({"graph": ["as-22july06", "astro-ph", "cage15"], "running": [395.186, 714.542, 999], "mod": [0.67, 0.74, 0.999]})
>>> a
    eps        graph   mod  running
0   220  as-22july06  0.28        2
1   176      belgium  0.94      879
2  1096       cage15  0.66     4292
>>> b
         graph    mod  running
0  as-22july06  0.670  395.186
1     astro-ph  0.740  714.542
2       cage15  0.999  999.000
>>> a.merge(b, on="graph")
    eps        graph  mod_x  running_x  mod_y  running_y
0   220  as-22july06   0.28          2  0.670    395.186
1  1096       cage15   0.66       4292  0.999    999.000
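
To get from the merged frame to the comparison you asked about, something like the following might work. It is only a sketch built on the same made-up data as above: merge does an inner join by default, so only graphs present in both frames survive, and the suffixes ("_a", "_b") are my own choice to replace the default _x/_y. The plotting line is left commented out because it assumes matplotlib is installed.

```python
import pandas as pd

# Same illustrative data as in the session above (not the asker's real frames).
a = pd.DataFrame({
    "graph": ["as-22july06", "belgium", "cage15"],
    "running": [2, 879, 4292],
    "mod": [0.28, 0.94, 0.66],
    "eps": [220, 176, 1096],
})
b = pd.DataFrame({
    "graph": ["as-22july06", "astro-ph", "cage15"],
    "running": [395.186, 714.542, 999],
    "mod": [0.67, 0.74, 0.999],
})

# how="inner" is the default: keep only rows whose graph appears in both frames.
# suffixes renames the overlapping columns; "_a"/"_b" are arbitrary labels.
merged = a.merge(b, on="graph", how="inner", suffixes=("_a", "_b"))

# One row per shared graph, with the two running values side by side.
comparison = merged.set_index("graph")[["running_a", "running_b"]]
print(comparison)

# Grouped bar chart of the two columns (requires matplotlib):
# comparison.plot(kind="bar")
```

If you also want to see the graphs that occur in only one of the frames, passing how="outer" keeps them and fills the missing side with NaN.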
DSM