I'm currently working on a large dataset, clustering each RFM_class separately. RFM_class has 125 distinct values ranging from 111 to 555. For trial runs of the script, my dataframe is currently sampled down to 10000 rows.
The logic I'm trying to implement is: take each RFM_class (125 distinct values), loop over the classes, and run a clustering method on the subset of rows belonging to each class to get a cluster_class column. Each subset's result is appended to an initially empty dataframe, and that dataframe is then merged into my main table.
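The plan above could be sketched roughly as follows. This is only an illustration under assumptions: the toy data copies the snapshot rows from this question, `assign_clusters` is a hypothetical placeholder for whatever clustering method is eventually used, and `customer_id` is assumed to be a unique key for merging back.

```python
import pandas as pd

# Toy stand-in for the main table (rows copied from the snapshot in the question).
df_test = pd.DataFrame({
    "RFM_class":    [555, 555, 145, 555, 445],
    "customer_id":  [1, 2, 3, 4, 5],
    "num_orders":   [1489, 72, 13, 208, 40],
    "recent_day":   [0, 3, 591, 0, 9],
    "amount_order": [18539000, 1069000, 1350000, 2119000, 698000],
})

core_col = ["num_orders", "recent_day", "amount_order"]


def assign_clusters(subset):
    """Hypothetical placeholder for a real clustering method.

    Labels a row 1 if its amount_order is above the subset median, else 0.
    """
    return (subset["amount_order"] > subset["amount_order"].median()).astype(int)


pieces = []
# groupby yields one (class value, sub-frame) pair per distinct RFM_class.
for rfm_value, subset in df_test.groupby("RFM_class"):
    labels = assign_clusters(subset[core_col])
    pieces.append(pd.DataFrame({
        "customer_id": subset["customer_id"],
        "cluster_class": labels,
    }))

# Stack the per-class results into one frame, then attach them to the main table.
cluster_df = pd.concat(pieces, ignore_index=True)
df_result = df_test.merge(cluster_df, on="customer_id", how="left")
```

Collecting the per-class pieces in a plain list and combining them once with `pd.concat` avoids growing a dataframe row by row inside the loop.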
This is a snapshot of the main table. I have reduced it to 5 columns here; the original has 11 columns.
df_test

RFM_class  customer_id  num_orders  recent_day  amount_order
555        1            1489        0           18539000
555        2            72          3           1069000
145        3            13          591         1350000
555        4            208         0           2119000
445        5            40          9           698000
I haven't got as far as the clustering yet; I'm stuck on looping over each RFM_class.
This is what I have been trying for the last couple of days, just to extract the subset for each RFM_class:
import pandas as pd

rfm_list = list(set(df_test['RFM_class']))
core_col = ['num_orders', 'recent_day', 'amount_order']
cl_class = []
for row in rfm_list:
    # select the clustering columns for the rows of this RFM_class
    a = pd.DataFrame(df_test[core_col][df_test.RFM_class == row], columns=core_col)
    cl_class.append(a)
cl_class
But the result is not what I expected: append does not seem to add new rows to my empty dataframe.
Is there a function to do this in pandas? I'm currently using Python 3.0.
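One possible direction, offered only as a sketch: appending to a Python list collects separate DataFrames rather than adding rows to a single one, and `pd.concat` can join them afterwards. The toy data below copies the snapshot rows, and the names `frames` and `combined` are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the main table (rows copied from the snapshot in the question).
df_test = pd.DataFrame({
    "RFM_class":    [555, 555, 145, 555, 445],
    "customer_id":  [1, 2, 3, 4, 5],
    "num_orders":   [1489, 72, 13, 208, 40],
    "recent_day":   [0, 3, 591, 0, 9],
    "amount_order": [18539000, 1069000, 1350000, 2119000, 698000],
})

core_col = ["num_orders", "recent_day", "amount_order"]

# One sub-frame per class, keyed by the class value.
frames = {
    rfm_value: subset[core_col]
    for rfm_value, subset in df_test.groupby("RFM_class")
}

# Collecting frames does not merge them; pd.concat joins them into one DataFrame.
# Passing a dict uses its keys as an outer index level, so each row keeps its class.
combined = pd.concat(frames, names=["RFM_class", "row"])

# Rows for a single class can then be pulled out by the outer index level.
class_555 = combined.loc[555]
```

Keeping the class value as an index level means the combined frame still records which subset each row came from.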