pandas newbie question:
I have a dataframe with millions of rows. A sample would be:
c_id c1 c2
0 10 100
0 15 110
0 15 112
2 96 120
56 43 42
For each customer ID (c_id), I want to create a table and do some stuff to it. What's the best way to do that? I sorted the dataframe by c_id, then set it as the index:
df = df.sort_values('c_id', ascending=False)
df = df.set_index('c_id')
but a simple operation like:
temp_df = df.loc[:0]
takes forever. What's the fastest way to approach this problem? I thought sorting and then calling set_index would do the trick, but I guess not.
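To make the setup reproducible, here is a minimal sketch on the sample rows above (the names `indexed`, `sliced` and `exact` are just made up for illustration, and `sort_values` is assumed as the current spelling of the sort). My understanding is that `:0` is a label slice rather than a lookup of the label 0, so on a descending index it may be selecting far more rows than I intended:

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "c_id": [0, 0, 0, 2, 56],
    "c1": [10, 15, 15, 96, 43],
    "c2": [100, 110, 112, 120, 42],
})

# Same steps as above: sort descending, then index by c_id
indexed = df.sort_values("c_id", ascending=False).set_index("c_id")

# Label slice: every row from the start of the index through label 0
sliced = indexed.loc[:0]

# Exact label lookup: only the rows whose c_id is 0
exact = indexed.loc[[0]]

print(len(sliced), len(exact))
```

On the sample, the slice covers the whole frame while the exact lookup returns only the three rows for c_id 0.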
EDIT1:
I want to get the list of all unique values of c1 for each value of c_id, so something like:
df.loc[:0].c1.unique()
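To show the result I'm after on the sample data, here is a minimal sketch, assuming `groupby` is a reasonable fit for this (I have not timed it on the full frame). `unique_c1` is just a hypothetical name:

```python
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    "c_id": [0, 0, 0, 2, 56],
    "c1": [10, 15, 15, 96, 43],
    "c2": [100, 110, 112, 120, 42],
})

# One pass over the frame: group rows by c_id and collect the distinct c1 values
per_customer = df.groupby("c_id")["c1"].unique()

# Plain dict of lists, keyed by c_id, for easier inspection
unique_c1 = {c_id: values.tolist() for c_id, values in per_customer.items()}

print(unique_c1)
```

This avoids building a separate table per customer; no sorting or set_index is needed for it to work.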