I have a dataframe with 2 columns.
df=pd.DataFrame({'values':arrays,'ii':lin_index})
I want to group the values by lin_index and get the mean per group and the most common value per group. I tried this:
bii=df.groupby('ii').mean()
bii2=df.groupby('ii').agg(lambda x:x.value_counts().index[0])
bii3=df.groupby('ii')['values'].agg(pd.Series.mode)
I wonder whether bii2 and bii3 return the same values. Then I want to write the mean and the most common value back into the original array:
bs=np.zeros((np.unique(array).shape[0],1))
bs[bii.index.values]=bii.values
Does this look good?
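To make the write-back step concrete, here is a minimal sketch on made-up data. It assumes the target array should have one slot per possible linear index and that cells with no samples should stay NaN. The toy `arrays` and `lin_index` values and the size `n_cells` are hypothetical, not taken from my real data.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the real `arrays` and `lin_index` (hypothetical data)
arrays = np.array([1.0, 1.0, 3.0, 2.0, 2.0, 5.0])
lin_index = np.array([4, 4, 4, 0, 0, 2])
n_cells = 6  # assumed size of the flattened target array

df = pd.DataFrame({'values': arrays, 'ii': lin_index})

# Mean of `values` for each linear index that actually occurs
group_mean = df.groupby('ii')['values'].mean()

# One slot per linear index; NaN marks cells that received no values
bs = np.full(n_cells, np.nan)
bs[group_mean.index.to_numpy()] = group_mean.to_numpy()
print(bs)
```

In this sketch `bs` is sized by the number of cells rather than by a count of unique values. If it were sized by a unique count, indexing with the raw `ii` values would raise an IndexError whenever an index is at least as large as that count.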
df looks like this:
          values        ii
0            1.0  10446786
1            1.0  11316289
2            1.0  16416704
3            1.0  12151686
4            1.0  30312736
...          ...       ...
93071038     3.0  28539525
93071039     3.0  19667948
93071040     3.0  22240849
93071041     3.0  22212513
93071042     3.0  41641943

[93071043 rows x 2 columns]
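Coming back to the bii2 vs bii3 question, a small check on made-up data might look like the sketch below. The toy frame is hypothetical and group 1 deliberately contains a tie. As far as I can tell, the two approaches can only differ when a group has several equally common values: `value_counts().index[0]` picks a single one, while `Series.mode()` returns all of them.

```python
import pandas as pd

# Hypothetical toy data: group 1 has a tie between 1.0 and 2.0
df = pd.DataFrame({'values': [2.0, 1.0, 1.0, 2.0, 3.0, 3.0, 4.0],
                   'ii':     [1,   1,   1,   1,   7,   7,   7]})

g = df.groupby('ii')['values']

# value_counts() sorts by count; which of the tied values comes first
# is not something to rely on
most_common_vc = g.agg(lambda x: x.value_counts().index[0])

# Series.mode() returns every tied value, sorted ascending
all_modes = g.apply(lambda x: x.mode().tolist())

# Taking the first mode gives exactly one value per group (the smallest
# of the tied values)
first_mode = g.agg(lambda x: x.mode().iloc[0])

print(most_common_vc)
print(all_modes)
print(first_mode)
```

For groups without ties all three give the same value. For tied groups, `first_mode` gives one well-defined value per group, which seems easier to write back into a numeric array than a list of modes.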