
I'm trying to return a single summed value from one DataFrame, using the keys in a second DataFrame to select which rows to sum.

For instance, take column 0 in df1:

   0
0  a
1  b
2  c

And sum the matching values from column 1 in df2:

   0  1
0  a  4
1  b  3
2  c  2
3  d  1
4  e  7

to return a value of 9. Values in column 0 in both DataFrames are non-repeating.

So far, other than a simple (and not at all Pandas-y) for loop, my best attempt is `result = df2.loc[(df2[0]==df1[0]), df2[1]].sum()`, which raises a "Can only compare identically-labeled Series objects" error.

I expect this is straightforward for those with more Pandas experience than I, but I'm Pooh-trying-to-leave-Rabbit's-house level stuck. Thanks for any help!

Charlie
  • Welcome to SO! Nearly there, try: `df2.loc[df2[0].isin(df1[0]), 1].sum()`. See: [`Series.isin`](https://pandas.pydata.org/docs/reference/api/pandas.Series.isin.html). For [`df.loc`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html) just pass the *label* (i.e. `1`, not `df2[1]`). – ouroboros1 Mar 07 '23 at 18:31
  • @ouroboros1 maybe I misunderstood what OP wants (I thought they wanted to map values: "*sum the matching values from column 1 in df2*"). In this case, maybe they just want a groupby? I can reopen if you want – mozway Mar 07 '23 at 18:36
  • @ouroboros1 fair enough, I changed the duplicate – mozway Mar 07 '23 at 18:40
  • Correct; the goal was to return a single value. I'm new enough to Pandas, though, that I did also appreciate the new rabbit hole to go down, @mozway; thanks much for that answer. Much appreciated! And heartening to see that I was at least in the ballpark. – Charlie Mar 07 '23 at 18:44
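
For anyone landing here later, the `isin` suggestion from the comments can be sketched as below. This is a minimal example, not a definitive answer: the two frames are rebuilt by hand from the tables in the question (assuming integer column labels `0` and `1`), and the variable name `mask` is introduced only for illustration. The original attempt likely fails because `df2[0] == df1[0]` compares two Series with different indexes element by element, whereas `isin` only checks membership, and because `.loc` expects the column label `1` rather than the column's values `df2[1]`.

```python
import pandas as pd

# Rebuild the frames from the question (assumed integer column labels 0 and 1).
df1 = pd.DataFrame({0: ["a", "b", "c"]})
df2 = pd.DataFrame({0: ["a", "b", "c", "d", "e"], 1: [4, 3, 2, 1, 7]})

# True for each row of df2 whose key also appears somewhere in df1's key column.
# Unlike ==, isin does not require the two Series to share an index or length.
mask = df2[0].isin(df1[0])

# Select column label 1 (the label, not the Series df2[1]) for those rows and sum.
result = df2.loc[mask, 1].sum()
print(result)  # 9
```

Rows `d` and `e` are excluded by the mask, so only 4 + 3 + 2 is summed.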
