
Original dataframe:

df.head()
     beer_beerid    review_profilename    review_overall
0     48215          stcules                   3.0
1     52159          oline73                   3.0
2     52159          alpinebryant              3.0
3     52159          rawthar                   4.0
4     52159          RangerClegg               3.5

I need to create a new column holding the number of times each beer_beerid occurs in this dataframe. For example, if beer_beerid 52159 occurs 4 times, the new column should be 4 on every row with that id.

I used the code below:

df['beer_review_count'] = df.groupby('beer_beerid').transform('count')

It gives the following error:

ValueError: Wrong number of items passed 2, placement implies 1
Anshuman Kumar
  • What output do you want in the 'beer_review_count' column of the resulting dataframe? I mean, what should each cell in that column contain? – shubham Jul 18 '19 at 06:08
  • Check the output: the number of rows might not match, or the datatype might be wrong. It should be a list or a Series. – Patel Jul 18 '19 at 06:15

2 Answers


Here's the solution: select a single column before calling transform().

df['beer_review_count'] = df.groupby('beer_beerid')['beer_beerid'].transform('count')

This works because transform('count') on a single selected column returns one Series aligned to the original rows:

   beer_beerid   profilename  overall  beer_review_count
0        48215       stcules      3.0                  1
1        52159       oline73      3.0                  4
2        52159  alpinebryant      3.0                  4
3        52159       rawthar      4.0                  4
4        52159   RangerClegg      3.5                  4
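
As a hedged illustration of why the question's version failed, the sketch below rebuilds the five sample rows from the question (the full dataset is assumed to behave the same way). Without a column selection, transform('count') returns one column per non-key column, two here, and that cannot be assigned to a single new column.

```python
import pandas as pd

# Sample rows copied from the question's df.head() output.
df = pd.DataFrame({
    "beer_beerid": [48215, 52159, 52159, 52159, 52159],
    "review_profilename": ["stcules", "oline73", "alpinebryant",
                           "rawthar", "RangerClegg"],
    "review_overall": [3.0, 3.0, 3.0, 4.0, 3.5],
})

# No column selected: the result is a DataFrame with one column for each
# non-key column (review_profilename and review_overall).
wide = df.groupby("beer_beerid").transform("count")
n_cols = wide.shape[1]

# One column selected: the result is a single Series aligned to df's index,
# so it can be assigned as a new column.
df["beer_review_count"] = (
    df.groupby("beer_beerid")["beer_beerid"].transform("count")
)
counts = df["beer_review_count"].tolist()
```

Any column without missing values could be selected in place of beer_beerid, since count only tallies non-null entries.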
N. Arunoprayoch

Assuming the schema in the edit is the correct one, try:

df['beer_beerid'].value_counts()

Please share a picture of the CSV file so that I can be sure; right now I am not sure whether the column is beerid or _beerid.


df.groupby('beer_beerid')['beer_beerid'].count()

EDIT:

A possible fix for the NaN error: unlike the other solution, this returns one row per beer_beerid, so the counts are not repeated.


The output from this should be:


beer_beerid
48215       1 
52159       4

These per-beer counts can then be mapped back onto the existing dataframe as a new column.
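
A minimal sketch of how the per-beer counts could be attached to the existing dataframe, assuming the column really is named beer_beerid as in the question. The use of Series.map here is an illustration and is not part of the original answer.

```python
import pandas as pd

# Only the id column from the question's sample rows is needed here.
df = pd.DataFrame({"beer_beerid": [48215, 52159, 52159, 52159, 52159]})

# One entry per distinct beer id: how many times that id appears.
per_beer = df["beer_beerid"].value_counts()

# Look up each row's id in the counts to build the requested column.
df["beer_review_count"] = df["beer_beerid"].map(per_beer)
mapped = df["beer_review_count"].tolist()
```

This gives the same per-row values as the transform('count') approach, while per_beer itself stays a compact summary with one row per id.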

Anshuman Kumar