Facing an Error while creating a new column

Question

Original dataframe:

df.head()
     beer_beerid    review_profilename    review_overall
0     48215          stcules                   3.0
1     52159          oline73                   3.0
2     52159          alpinebryant              3.0
3     52159          rawthar                   4.0
4     52159          RangerClegg               3.5

I need to create a new column holding the number of occurrences of each beer_beerid in this dataframe. For example, if beerid 52159 occurs 4 times, the new column value for every row with that beerid should be 4.

I used the code below:

df['beer_review_count'] = df.groupby('beer_beerid').transform('count')

It gives the following error:

ValueError: Wrong number of items passed 2, placement implies 1

What output do you want in the 'beer_review_count' column of the resulting dataframe? I mean, what should each cell in that column contain? – shubham, Jul 18 '19 at 06:08
Check the output: the number of rows might not be the same, or the datatype might be wrong. It should be a list or a Series. – Patel, Jul 18 '19 at 06:15

score 0 · Accepted Answer · answered Jul 18 '19 at 06:28

Here's the solution.

df['beer_review_count'] = df.groupby('beer_beerid')['beer_beerid'].transform('count')

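It works because selecting the single column 'beer_beerid' before calling transform() returns a Series with one count per row. The original call returned a DataFrame with two columns (one per remaining column), which cannot be assigned to a single new column. The result: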

   beer_beerid   profilename  overall  beer_review_count
0        48215       stcules      3.0                  1
1        52159       oline73      3.0                  4
2        52159  alpinebryant      3.0                  4
3        52159       rawthar      4.0                  4
4        52159   RangerClegg      3.5                  4
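
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the five rows shown in the question, and the variable name wide is only for illustration; it is not part of the original code.

```python
import pandas as pd

# Rebuilt by hand from the five rows in the question (assumption: these are representative)
df = pd.DataFrame({
    "beer_beerid": [48215, 52159, 52159, 52159, 52159],
    "review_profilename": ["stcules", "oline73", "alpinebryant", "rawthar", "RangerClegg"],
    "review_overall": [3.0, 3.0, 3.0, 4.0, 3.5],
})

# No column selected: transform returns a DataFrame with one column per
# non-key column, so it cannot be assigned to a single new column.
wide = df.groupby("beer_beerid").transform("count")

# One column selected: transform returns a Series aligned to df's index,
# which can be assigned directly.
df["beer_review_count"] = df.groupby("beer_beerid")["beer_beerid"].transform("count")

print(wide.shape)
print(df)
```

The first print shows that the unselected call produces two columns, which matches the "passed 2, placement implies 1" part of the error message.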

Anshuman Kumar · Answer 2 · 2019-07-18T07:12:34.670

-1

Assuming the schema in the edit is the correct one, try:

df['beer_beerid'].value_counts()

Please share a picture of the CSV file so that I can be sure; right now I am not sure whether the column is beerid or _beerid.

df.groupby('beer_beerid')['beer_beerid'].count()

EDIT:

A possible fix for the NaN error. Unlike the other solution, this should avoid redundancy, i.e. the repeating of values.

According to what I did, the output should be:

beer_beerid
48215       1
52159       4

These per-id counts can then be used to add the column to the existing dataframe.
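
One way to attach per-id counts to the original rows is Series.map. This is only a sketch under the assumption that the data looks like the five rows in the question (rebuilt by hand below); the variable name counts is illustrative.

```python
import pandas as pd

# Rebuilt by hand from the five rows in the question (assumption)
df = pd.DataFrame({
    "beer_beerid": [48215, 52159, 52159, 52159, 52159],
    "review_profilename": ["stcules", "oline73", "alpinebryant", "rawthar", "RangerClegg"],
    "review_overall": [3.0, 3.0, 3.0, 4.0, 3.5],
})

# One row per distinct id: the index is the id, the value is its count
counts = df["beer_beerid"].value_counts()

# Look up each row's id in counts to get a full-length column
df["beer_review_count"] = df["beer_beerid"].map(counts)

print(counts)
print(df)
```

Note that once the counts are mapped back, each value repeats for every row with that id, the same as with transform('count'); the summary without repetition is counts itself.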

edited Jul 18 '19 at 07:12

answered Jul 18 '19 at 06:16

Anshuman Kumar


It's "beer_beerid". – Bindhu Balu Jul 18 '19 at 06:23
