0

Having the following DF :

Name_A Name_B A B Type_A Type_B
Test Test 5 5 Game Game
nan Test nan 5 nan Game
nan Test nan 10 nan Game

doing :

DF.groupby(['Name_A',"Type_A"], as_index=False)[['A','B']].sum()
NewDF = DF.where(DF['TypeA']== "Game")

returns the following

Name_A Name_B A B Type_A Type_B
Test nan 5 5 Game Game

But I would like to get :

Name_A Name_B A B Type_A Type_B
Test Test 5 20 Game Game

Is it possible to do so ?

Maybe I need to use merge instead of group by? The answer might be close to this one but I need to use a different group by.

Gonçalo Peres
  • 11,752
  • 3
  • 54
  • 83
TourEiffel
  • 4,034
  • 2
  • 16
  • 45
  • 1
    @ThePyGuy I edited the post it is `nan` not sure if it is the same than `NaN` – TourEiffel Sep 19 '22 at 16:13
  • 1
    There are various ways to obtain that desired output. Where do OP wants `Type_B` `Game` to come from? Is there any special requirement? – Gonçalo Peres Sep 19 '22 at 19:07
  • 1
    @GonçaloPeres it comes from sum of `B` where `Names_B = Test`. The only requirement in this example is to sum `B` Where `Names_B = Test` and `Type_B = Game` and `A` Where `Names_A = Test` and `Type_A = Game` – TourEiffel Sep 20 '22 at 07:19

1 Answers1

1

Given OP new requirement, achieve that using numpy.where, pandas.DataFrame.astype (to be able to handle nan values) and .sum as follows

df['A'] = np.where((df['Name_A'] == 'Test') & (df['Type_A'] == 'Game'), df['A'].astype(float).sum(), df['A'])

df['B'] = np.where((df['Name_B'] == 'Test') & (df['Type_B'] == 'Game'), df['B'].astype(float).sum(), df['B'])

[Out]:
  Name_A Name_B    A     B Type_A Type_B
0   Test   Test  5.0  20.0   Game   Game
1    nan   Test  nan  20.0    nan   Game
2    nan   Test  nan  20.0    nan   Game

Then, considering that OP only wants to retrieve the first line, one can use pandas.DataFrame.iloc

df = df.iloc[0:1]

[Out]:
  Name_A Name_B    A     B Type_A Type_B
0   Test   Test  5.0  20.0   Game   Game

One can wrap all those operations in a function, let's call it gamefunc as follows

def gamefunc(df):

    df['A'] = np.where((df['Name_A'] == 'Test') & (df['Type_A'] == 'Game'), df['A'].astype(float).sum(), df['A'])

    df['B'] = np.where((df['Name_B'] == 'Test') & (df['Type_B'] == 'Game'), df['B'].astype(float).sum(), df['B'])

    df = df.iloc[0:1]

    return df

And then all one has to do is to apply the function to the dataframe

df = gamefunc(df)

[Out]:
  Name_A Name_B    A     B Type_A Type_B
0   Test   Test  5.0  20.0   Game   Game
Gonçalo Peres
  • 11,752
  • 3
  • 54
  • 83