I'm new to Pandas and am trying to do some basic data transformation exercises. One method I tried to use is groupby, but I fail to understand the output I am seeing.
import pandas as pd

df = pd.DataFrame({'row': range(10), 'time': range(10), 'machine': ['M1', 'M2', 'M3', 'M1', 'M2', 'M3', 'M1', 'M2', 'M3', 'M1'], 'value1': range(10), 'value2': range(10)})

def func(g):
    # g is whatever groupby passes in for one group
    print('---- %s' % type(g))
    return 42

print(df.groupby('machine').apply(func))
Why is the print statement in the function executed 4 times? I would have expected groupby to split df into 3 DataFrames (one per machine) and apply func to each of them, but that is not what I observe.
The complete output:
---- <class 'pandas.core.frame.DataFrame'>
---- <class 'pandas.core.frame.DataFrame'>
---- <class 'pandas.core.frame.DataFrame'>
---- <class 'pandas.core.frame.DataFrame'>
machine
M1 42
M2 42
M3 42
dtype: int64
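To make my expectation concrete, here is a minimal sketch of the behaviour I had in mind, written as an explicit loop over the groups instead of apply. The `calls` list and the two-argument `func` are hypothetical helpers added only for this illustration, and the DataFrame is trimmed to two columns; it relies on the fact that iterating over a groupby object yields (key, sub-DataFrame) pairs.

```python
import pandas as pd

df = pd.DataFrame({
    'machine': ['M1', 'M2', 'M3', 'M1', 'M2', 'M3', 'M1', 'M2', 'M3', 'M1'],
    'value1': range(10),
})

calls = []  # hypothetical helper: records the group key each time func runs

def func(key, g):
    # g is the sub-DataFrame holding the rows for one machine
    calls.append(key)
    return 42

# Iterating a GroupBy object yields one (key, sub-DataFrame) pair per group,
# so func runs exactly once for each machine.
expected = {key: func(key, g) for key, g in df.groupby('machine')}

print(len(calls))  # 3
print(expected)    # {'M1': 42, 'M2': 42, 'M3': 42}
```

With this loop the function runs once per machine, which is the call count I expected from apply as well.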
Update
I just found this duplicate.