
I want to merge the following DataFrame with the Series object below it (details below them):

df1

    wallet  cumulative_rewards  position  position_rewards position_type
0  0x12345                   0      LUSD              1000       LP Pair
0  0x12345                   0      TOKE               200      LP Token
1  0xabcde                   0    UNI_LP               500            LP
1  0xabcde                   0  SUSHI_LP                 0            LP
1  0xabcde                   0       DAI               100       LP Pair

series_object: grouped_toke_transfers = toke_transfers_df.groupby('from_address')['value'].sum(), where the index is from_address and the values of the Series are the summed values:

from_address   value
0x12345        13687137402763990447827
0xabcde        58950104860666120622
0xfghij        3491287228881431880579
0xklmno        1260986666816869789

Merged df details:

  1. Keep the wallets from df1 (repeated indexes/wallets);
  2. Add value from df2 to each corresponding wallet (the same value per wallet should appear on each row for that wallet);
  3. The merged df should not include wallets from df1 where that wallet does not exist in df2.

Bonus: What I want to do, ultimately, is query the merged table for value (from df2) based on the position_type (from df1). How would I go about doing this with groupby()?

Luiz Scheuer

1 Answer


Use:

print (df1)
    wallet  cumulative_rewards  position  position_rewards position_type
0  0x12345                   0      LUSD              1000       LP Pair
0  0x12345                   0      TOKE               200      LP Token
1  0xabcde                   0    UNI_LP               500            LP
1  0xabcde                   0  SUSHI_LP                 0            LP
1      eee                   0       DAI               100       LP Pair

Map solution - create a new column value with Series.map, then set wallet to missing where value is missing using Series.mask:

grouped_toke_transfers = toke_transfers_df.groupby('from_address')['value'].sum()
df1['value'] = df1['wallet'].map(grouped_toke_transfers)
df1['wallet'] = df1['wallet'].mask(df1['value'].isna())

print (df1)
    wallet  cumulative_rewards  position  position_rewards position_type  \
0  0x12345                   0      LUSD              1000       LP Pair   
0  0x12345                   0      TOKE               200      LP Token   
1  0xabcde                   0    UNI_LP               500            LP   
1  0xabcde                   0  SUSHI_LP                 0            LP   
1      NaN                   0       DAI               100       LP Pair   

                     value  
0  13687137402763990447827  
0  13687137402763990447827  
1     58950104860666120622  
1     58950104860666120622  
1                      NaN 
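To also satisfy requirement 3 (exclude wallets not present in the Series) and answer the bonus question, a minimal sketch along these lines should work. The sample data here is illustrative (the `0xfffff` wallet and the `merged`/`totals` names are my own, not from the original post):

```python
import pandas as pd

# Illustrative data mirroring the question; values exceed int64, so the
# Series holds Python ints (object dtype), which pandas sums exactly.
df1 = pd.DataFrame({
    'wallet': ['0x12345', '0x12345', '0xabcde', '0xabcde', '0xfffff'],
    'position_type': ['LP Pair', 'LP Token', 'LP', 'LP', 'LP Pair'],
    'position_rewards': [1000, 200, 500, 0, 100],
})
grouped_toke_transfers = pd.Series(
    {'0x12345': 13687137402763990447827, '0xabcde': 58950104860666120622},
    name='value',
)

# Map each wallet to its summed transfer value (NaN where no match)
df1['value'] = df1['wallet'].map(grouped_toke_transfers)

# Requirement 3: drop rows whose wallet is not in the Series
merged = df1.dropna(subset=['value'])

# Bonus: total value per position_type
totals = merged.groupby('position_type')['value'].sum()
print(totals)
```

Here `dropna` replaces the `mask` step from the answer above, since the question asks for non-matching wallets to be excluded entirely rather than kept as NaN.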
jezrael
  • Actually, I apologize, I had a mistake in terms of the data type of `df2`. I updated the question to show the actual data type now. It is a `Series` object where the `index` is `from_address`, so `from_address` is not a column. That's why your solution wasn't working for me. – Luiz Scheuer Sep 08 '22 at 11:00
  • 1
    @LuizScheuer - answer was edited. – jezrael Sep 08 '22 at 11:01
  • Checking now, thanks! – Luiz Scheuer Sep 08 '22 at 11:03
  • What it does now is it is giving me `NaNs` for both the `wallet` and `value` columns (although the index is correct). I'm also checking here if there is anything wrong from my side. – Luiz Scheuer Sep 08 '22 at 11:06