How can I find the most frequent items in a pandas DataFrame?

Question

I am trying to find which products sell most in a specific season, but I am having difficulty. I have created a Season attribute and determined the season in which each product was sold. As an example I have taken Season 1 (Winter); I will do the same for the other seasons, and then I have to show in plots which products sell most in each season.

Here is a subset of the data (StockCode, Description, month, Season):

22460,EMBOSSED GLASS TEALIGHT HOLDER,12,1
84832,ZINC WILLIE WINKIE  CANDLE STICK,12,1
23084,RABBIT NIGHT LIGHT,12,1
84879,ASSORTED COLOUR BIRD ORNAMENT,12,1
84945,MULTI COLOUR SILVER T-LIGHT HOLDER,12,1
22113,GREY HEART HOT WATER BOTTLE,12,1
23356,LOVE HOT WATER BOTTLE,12,1
22726,ALARM CLOCK BAKELIKE GREEN,12,1
22727,ALARM CLOCK BAKELIKE RED ,12,1
16016,LARGE CHINESE STYLE SCISSOR,12,1
21916,SET 12 RETRO WHITE CHALK STICKS,12,1
84692,BOX OF 24 COCKTAIL PARASOLS,12,1
84946,ANTIQUE SILVER T-LIGHT GLASS,12,1
21684,SMALL MEDINA STAMPED METAL BOWL ,12,1
22398,MAGNETS PACK OF 4 SWALLOWS,12,1
23328,SET 6 SCHOOL MILK BOTTLES IN CRATE,12,1
23145,ZINC T-LIGHT HOLDER STAR LARGE,12,1
22466,FAIRY TALE COTTAGE NIGHT LIGHT,12,1
22061,LARGE CAKE STAND  HANGING STRAWBERY,12,1
23275,SET OF 3 HANGING OWLS OLLIE BEAK,12,1
21217,RED RETROSPOT ROUND CAKE TINS,12,1

My pandas dataframe looks like this: [image: Pandas Dataframe]

I am trying to get the following dataframe, in which a new attribute counts how many times each item was purchased, sorted in ascending order: [image: Required]

I have tried the following code, without success:

df_top_freq = data1.groupby(['Description'])['StockCode'].agg(
    {"code_count": len}).sort_values("code_count", ascending=False).head(n).reset_index()


---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-32-8d5e96d38ae0> in <module>
----> 1 df_top_freq = data1.groupby(['Description'])['StockCode'].agg(
      2     {"code_count": len}).sort_values("code_count", ascending=False).head(n).reset_index()

AttributeError: 'NoneType' object has no attribute 'groupby'

and also

count = data1['StockCode'].value_counts() 
print(count)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-958a6e8a501c> in <module>
----> 1 count = data1['StockCode'].value_counts()
      2 print(count)

TypeError: 'NoneType' object is not subscriptable

Can anyone help me please?

[Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) , please post the data as text — anky, Jun 08 '19 at 16:44
I just posted a subset of the data, because the full dataset consists of 392692 rows — hendi, Jun 08 '19 at 17:01

score 0 · Answer 1 · answered Jun 08 '19 at 17:26

Based on the error "AttributeError: 'NoneType' object has no attribute 'groupby'", your variable data1 is not a data frame. It is None, which is why you cannot call groupby on it.

Check the value of data1, repopulate the variable with your data, and try again.
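The question does not show how data1 was created, so the following is only a guess at one common cause: pandas methods called with inplace=True modify the frame and return None, so assigning their result leaves the variable as None. A minimal sketch with made-up sample rows (the names df and bad_data1 are hypothetical):

```python
import pandas as pd

# Hypothetical sample rows; the last one is missing its values.
df = pd.DataFrame(
    {
        "StockCode": ["22460", "84832", None],
        "Description": [
            "EMBOSSED GLASS TEALIGHT HOLDER",
            "ZINC WILLIE WINKIE  CANDLE STICK",
            None,
        ],
    }
)

# With inplace=True the method returns None, so this assignment
# stores None instead of a DataFrame.
bad_data1 = df.copy().dropna(inplace=True)
print(type(bad_data1))

# Without inplace=True the cleaned frame is returned and can be assigned.
data1 = df.dropna()
print(type(data1), len(data1))
```

If your own code assigns the result of an inplace call like this, removing either the assignment or inplace=True should leave data1 as a real DataFrame.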

Also, using a dictionary for renaming in agg is deprecated. You may want to check the question linked below for alternative options.

Rename result columns from Pandas aggregation ("FutureWarning: using a dict with renaming is deprecated")
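One alternative to the dict-based renaming is named aggregation (available from pandas 0.25). This is a sketch on made-up rows rather than your real data; the values of data1 and n here are assumptions:

```python
import pandas as pd

# Hypothetical sample standing in for the real data1.
data1 = pd.DataFrame(
    {"StockCode": ["22460", "84832", "22460", "23084", "22460", "84832"]}
)
n = 2  # how many of the most frequent codes to keep (assumed value)

df_top_freq = (
    data1.groupby("StockCode")["StockCode"]
    .agg(code_count="count")  # keyword name becomes the output column name
    .sort_values("code_count", ascending=False)
    .head(n)
    .reset_index()
)
print(df_top_freq)
```

The keyword argument replaces the {"code_count": len} dictionary. "count" counts non-null values per group, which here matches len because rows with a missing StockCode are dropped from the group keys anyway.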

score 0 · Accepted Answer · answered Jun 08 '19 at 17:34

This seems to be working :)

df_top_freq = data1.groupby("StockCode")['StockCode'].agg(
  {"code_count": len}).sort_values("code_count", ascending=False).head(n).reset_index()

I grouped by the stock code rather than by the Description.
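Since the question asks for the most frequent products in each season, the same idea can be extended by grouping on Season as well. This is a sketch on made-up rows, not tested on the real dataset; the sample values and n are assumptions:

```python
import pandas as pd

# Hypothetical sample with two seasons.
data1 = pd.DataFrame(
    {
        "StockCode": ["22460", "22460", "84832", "23084", "23084", "23084", "22460"],
        "Season": [1, 1, 1, 2, 2, 2, 2],
    }
)
n = 1  # number of top products to keep per season (assumed value)

# Count the rows for every (Season, StockCode) pair.
counts = (
    data1.groupby(["Season", "StockCode"])
    .size()
    .reset_index(name="code_count")
    .sort_values(["Season", "code_count"], ascending=[True, False])
)

# After sorting, the first n rows of each season are its most frequent codes.
top_per_season = counts.groupby("Season").head(n).reset_index(drop=True)
print(top_per_season)
```

The resulting frame has one block of rows per season, which could then be filtered by Season and plotted.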

answered by Mikulas
