
I feel like this is a very obvious mistake on my part, but for some reason the solution is escaping me right now.

So, first, create the following dataframe:

import pandas as pd
A = {'device': 1, 'component': 100, 'value': 10}
B = {'device': 1, 'component': 100, 'value': 20}
C = {'device': 1, 'component': 101, 'value': 20}
D = {'device': 2, 'component': 100, 'value': 30}
E = {'device': 2, 'component': 100, 'value': 31}
F = pd.DataFrame([A, B, C, D, E])

Now, I want to apply a function to each unique group of "device" and "component". For simplicity, assume I simply want to print the shape of each group's dataframe:

def simple(df):
  print(df.shape)

Now, apply it per group:

F.groupby(['device', 'component']).apply(simple)

The output is

(2, 3)
(2, 3)
(1, 3)
(2, 3)

Which is what you would expect, but let's change this now

A = {'device': 1, 'component': 100, 'value': 10}
B = {'device': 1, 'component': 100, 'value': 20}
C = {'device': 1, 'component': 100, 'value': 20}
F_new = pd.DataFrame([A, B, C])
F_new.groupby(['device', 'component']).apply(simple)

This gives

(3, 3)
(3, 3)

Why does it not give me (3, 3) once and only once?

Dammi
  • You didn't notice the extra `(2, 3)` on your first test, how was that expected? – Martijn Pieters Mar 06 '19 at 14:15
  • If you change the simple function to print(df) you will notice there is an extra there too. – Christian Sloper Mar 06 '19 at 14:16
  • Wow, indeed I knew I was having a major case of tunnel vision. Didn't even notice the duplicated line in my example! Thanks for the duplicate link @jezrael, couldn't find it on my own! :) – Dammi Mar 06 '19 at 14:17
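
As the comments point out, both runs print one more line than there are groups. Older pandas versions (before 0.25) documented that `apply` calls the function twice on the first group to choose a code path, which would account for the extra line, though whether that applies depends on the installed version. A minimal sketch of one way to sidestep it, assuming side effects such as `print` should run exactly once per group: iterate over the groups explicitly instead of using `apply`. The helper name `shapes_per_group` is hypothetical, not part of pandas.

```python
import pandas as pd


def shapes_per_group(df, keys):
    """Hypothetical helper: map each group key to that group's shape.

    Iterating over the GroupBy object visits every group exactly once,
    so nothing is evaluated a second time for the first group.
    """
    return {key: group.shape for key, group in df.groupby(keys)}


# Same data as F in the question: three distinct (device, component) pairs.
F = pd.DataFrame([
    {'device': 1, 'component': 100, 'value': 10},
    {'device': 1, 'component': 100, 'value': 20},
    {'device': 1, 'component': 101, 'value': 20},
    {'device': 2, 'component': 100, 'value': 30},
    {'device': 2, 'component': 100, 'value': 31},
])

# Same data as F_new in the question: a single (device, component) pair.
F_new = pd.DataFrame([
    {'device': 1, 'component': 100, 'value': 10},
    {'device': 1, 'component': 100, 'value': 20},
    {'device': 1, 'component': 100, 'value': 20},
])

shapes_F = shapes_per_group(F, ['device', 'component'])
shapes_F_new = shapes_per_group(F_new, ['device', 'component'])

# One shape per group: three entries for F, one entry for F_new.
for shape in shapes_F.values():
    print(shape)
for shape in shapes_F_new.values():
    print(shape)
```

If only the row counts matter, `F.groupby(['device', 'component']).size()` gives one count per group without a custom function.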
