I want to group a pandas DataFrame by a column named 'ad_id'. Each entry in this column is an integer, for example 7, 12, 120003, 12, 7. After the groupby, the integers seem to get mangled: instead of showing me 7, 12, 120003, the result just shows 1, 2, 3. Why does that happen? Initial dataframe: (image) Here's my code:

import pandas as pd

ads_df = pd.read_csv('clicks.csv')
each_ad_counts = ads_df.groupby('ad_id').size()
each_ad_click_counts = ads_df.groupby('ad_id')['clicked'].sum()

After groupby:

(image)

But the original dataframe has no ad_id equal to 1, 2, or 3.

Candice Zhang
  • Please provide a [mcve](http://stackoverflow.com/help/mcve). (Also, there is a convention on Stack Overflow to not include greetings such as "Thanks!") – Julien Marrec Nov 30 '16 at 22:55
  • Just edited my question – Candice Zhang Nov 30 '16 at 23:04
  • Try adding `.reset_index()` to the end of your second line. Does this help? – Alex Nov 30 '16 at 23:14
  • Anyone wanting to help would probably need at least sample data to try and replicate your problem. – semore_1267 Nov 30 '16 at 23:15
  • Pictures aren't great, we need something we can use (either something we can read through `pd.read_clipboard` or a sample of code that produces a dataframe). Please read [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – Julien Marrec Nov 30 '16 at 23:44
  • Does the following work for you? `df = pd.DataFrame({'ad_id': [7,12,120003,7,35,35,35,35,7], 'clicked':[0,1,1,1,0,0,1,1,1]})` `each_ad_click_counts = df.groupby('ad_id')['clicked'].sum()` This yields the appropriate result for me with the correct ids. – 3novak Dec 01 '16 at 02:09
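
The two suggestions in the comments can be tried together in a small sketch. It uses the sample data from 3novak's comment, not the real `clicks.csv`, which was never posted. The name `summary` and the column labels `rows` and `clicks` are made up for illustration. In the output of a groupby, `ad_id` becomes the index of the result and the numbers next to it are counts or sums, so one possible reading of the screenshot is that the 1, 2, 3 are values and not ids. That is only a guess without the actual data.

```python
import pandas as pd

# Sample data from the comment thread; the real clicks.csv is not available.
df = pd.DataFrame({
    'ad_id': [7, 12, 120003, 7, 35, 35, 35, 35, 7],
    'clicked': [0, 1, 1, 1, 0, 0, 1, 1, 1],
})

# Same two aggregations as in the question.
each_ad_counts = df.groupby('ad_id').size()
each_ad_click_counts = df.groupby('ad_id')['clicked'].sum()

# The ad ids live in the index of the result, the counts are the values.
print(each_ad_counts.index.tolist())  # → [7, 12, 35, 120003]
print(each_ad_counts.tolist())        # → [3, 1, 4, 1]

# reset_index() turns ad_id back into an ordinary column, which makes it
# easier to see which number is an id and which is a count.
# 'rows' and 'clicks' are illustrative column names.
summary = (
    each_ad_counts.rename('rows')
    .to_frame()
    .join(each_ad_click_counts.rename('clicks'))
    .reset_index()
)
print(summary)
```

If the real data still shows ids that are not in the file, checking `ads_df['ad_id'].dtype` and `ads_df['ad_id'].unique()` right after `read_csv` would show what pandas actually parsed.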

0 Answers