If this is a duplicate, please link me to the duplicate. I have not found any other posts that have answered my question.
I have a dataframe, knn_res
, with the following dimensions and data:
username Prediction is_bot
0 megliebsch 1 0 1 megliebsch 1 0
2 megliebsch 1 0
3 megliebsch 1 0
4 megliebsch 1 0
... ... ... ...
1220 ARTHCLAUDIA 1 1
1221 ARTHCLAUDIA 1 1 1222 ARTHCLAUDIA 1 1
1223 ARTHCLAUDIA 1 1
1224 ASSUNCAOWALLAS 1 1
[1225 rows x 3 columns]
What I would like to do is, for each username, count the number of predictions where prediction = 1
and where prediction = 0
and create two new columns with those values. For example using below dataset:
| username | prediction | is_bot |
|:--------:|:----------:|:------:|
| foo | 1 | 1 |
| foo | 1 | 1 |
| foo | 1 | 1 |
| foo | 0 | 1 |
| foo | 0 | 1 |
| foo1 | 0 | 1 |
| foo1 | 0 | 1 |
| foo1 | 0 | 0 |
| foo1 | 0 | 0 |
| foo1 | 1 | 0 |
| foo1 | 1 | 0 |
| foo1 | 0 | 0 |
| foo2 | 1 | 0 |
| foo2 | 1 | 0 |
| foo2 | 1 | 1 |
I would want:
| username | count_bot | count_human |
|:--------:|:----------:|:-----------:|
| foo | 3 | 2 |
| foo1 | 2 | 5 |
| foo2 | 3 | 0 |
where the following logic applies:
For each row, if
Prediction == 1
, then increase thecount_bot
counter. IfPrediction == 0
, then increase thecount_human
counter. Then, append the totals for each row and group by.
So far, I have tried referencing this question and tried the following:
knn_res['count_bot'] = knn_res[knn_res.Prediction == 1].count()
print(knn_res)
Which yields:
username Prediction is_bot count_bot
0 megliebsch 1 0 NaN
1 megliebsch 1 0 NaN
2 megliebsch 1 0 NaN
3 megliebsch 1 0 NaN
4 megliebsch 1 0 NaN
... ... ... ... ...
1220 ARTHCLAUDIA 1 1 NaN
1221 ARTHCLAUDIA 1 1 NaN
1222 ARTHCLAUDIA 1 1 NaN
1223 ARTHCLAUDIA 1 1 NaN
1224 ASSUNCAOWALLAS 1 1 NaN
Trying:
new = knn_res.groupby('username').sum()
print(new)
Yields:
Prediction is_bot
username
666STEVEROGERS 25 25
ADELE_BROCK 1 25
ADRIANAMFTTT 24 25
AHMADRADJAB 1 25
ALBERTA_HAYNESS 24 25
ALTMANBELINDA 23 25
ALVA_MC_GHEE 25 25
ANGELITHSS 25 25
ANN1EMCCONNELL 25 25
ANWARJAMIL22 25 25
AN_N_GASTON 25 25
ARONHOLDEN8 25 25
ARTHCLAUDIA 25 25
ASSUNCAOWALLAS 1 1
BECCYWILL 9 25
BELOZEROVNIKIT 17 25
BEN_SAR_GENT 1 25
BERT_HENLEY 24 25
BISHOLORINE 25 25
BLACKERTHEBERR5 11 25
BLACKTIVISTSUS 7 25
BLACK_ELEVATION 24 25
BOGDANOVAO2 7 25
BREMENBOTE 10 25
B_stever96 1 0
CALIFRONIAREP 24 25
C_dos_94 25 24
Cassidygirly 25 0
ChuckSpeaks_ 25 0
Cyabooty 0 0
DurkinSays 1 0
LSU_studyabroad 24 0
MisMonWEXP 0 0
NextLevel_Mel 25 0
PeterDuca 24 0
ShellMarcel 25 0
Sir_Fried_Alott 25 0
XavierRivera_ 0 0
ZacharyFlair 0 0
brentvarney44 1 0
cbars68 0 0
chloeschultz11 25 0
hoang_le_96 1 0
kdougherty178 25 0
lasallephilo 0 0
lovely_cunt_ 1 0
megliebsch 24 0
msimps_15 24 0
okweightlossdna 24 0
tankthe_hank 24 0
What am I doing wrong to achieve my desired result?