I started off wanting to turn a column from a pandas dataframe into a list and then get its unique values, so that I could iterate over them in a for loop and create a smaller dataframe for each one, i.e. one per cluster. I then want to store these smaller dataframes in a dictionary.
@ben suggested I start a new question and ask how to use the GroupBy method of pandas dataframes to perform this task.
My original post is over here: get list from pandas dataframe column
My Data:
cluster  load_date  budget  actual  fixed_price
A        1/1/2014     1000    4000  Y
A        2/1/2014    12000   10000  Y
A        3/1/2014    36000    2000  Y
B        4/1/2014    15000   10000  N
B        4/1/2014    12000   11500  N
B        4/1/2014    90000   11000  N
C        7/1/2014    22000   18000  N
C        8/1/2014    30000   28960  N
C        9/1/2014    53000   51200  N
For example: for item in cluster_list (where cluster_list is the unique set of values in the cluster column),
create a dataframe for cluster A where budget > X, etc.
Then do the same for the other clusters, and put them all in a dictionary.
Finally, I want to be able to get a certain dataframe out of the dictionary, say only the dataframe for cluster B where budget > X:
def GetDf(key):
    return df_dict[key]  # df_dict is the dictionary of per-cluster dataframes
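To make the goal concrete, here is a rough sketch of what I imagine the GroupBy version might look like. I am not sure this is the idiomatic way to do it. The threshold `X = 10000` and the names `build_cluster_dfs`, `cluster_dfs` and `get_df` are placeholders I made up for illustration, and the data is just the sample above read in as whitespace-separated text.

```python
import io

import pandas as pd

# Sample data from the question, whitespace-separated.
raw = """cluster load_date budget actual fixed_price
A 1/1/2014 1000 4000 Y
A 2/1/2014 12000 10000 Y
A 3/1/2014 36000 2000 Y
B 4/1/2014 15000 10000 N
B 4/1/2014 12000 11500 N
B 4/1/2014 90000 11000 N
C 7/1/2014 22000 18000 N
C 8/1/2014 30000 28960 N
C 9/1/2014 53000 51200 N
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

X = 10000  # hypothetical budget threshold, not a value from my real data


def build_cluster_dfs(frame, min_budget):
    """Return {cluster value: rows of that cluster with budget > min_budget}."""
    filtered = frame[frame["budget"] > min_budget]
    # Iterating over a GroupBy object yields (group key, sub-dataframe) pairs.
    return {name: group for name, group in filtered.groupby("cluster")}


cluster_dfs = build_cluster_dfs(df, X)


def get_df(key):
    """Look up the dataframe stored for one cluster."""
    return cluster_dfs[key]


print(get_df("B"))
```

One thing I noticed with this approach: because the filter runs before the grouping, a cluster with no rows above the threshold would not appear in the dictionary at all, so `get_df` would raise a `KeyError` for it.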
Thanks in advance