I need to do something very similar to this question: Pandas convert dataframe to array of tuples
The difference is that I need not a single list of tuples for the entire DataFrame, but a list of lists of tuples, grouped by the value of one column.
Suppose this is my data set:
    t_id   A     B
-  -----  --  ----
0  AAAA    1   2.0
1  AAAA    3   4.0
2  AAAA    5   6.0
3  BBBB    7   8.0
4  BBBB    9  10.0
...
I want to produce as output:
[[(1, 2.0), (3, 4.0), (5, 6.0)], [(7, 8.0), (9, 10.0)]]
That is, one list for 'AAAA', another for 'BBBB' and so on.
I've tried with two nested for loops. It seems to work, but it is taking too long (actual data set has ~1M rows):
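result = []
for t in df['t_id'].unique():
    tuple_list = []
    # Filter the rows belonging to this t_id, then walk them one by one
    for x in df[df['t_id'] == t].iterrows():
        # x is an (index, row) pair; keep only columns A and B
        row = x[1][['A', 'B']]
        tuple_list.append(tuple(row))
    result.append(tuple_list)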
Is there a faster way to do it?
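For reference, here is a minimal sketch of one direction that avoids the row-by-row loop, assuming the data looks like the sample above. It splits the frame once with groupby and zips the two columns per group. The sample DataFrame is rebuilt here only to make the snippet self-contained, and I have not timed it on the full ~1M rows, so treat it as a starting point rather than a measured answer.

```python
import pandas as pd

# Sample data mirroring the table in the question (illustrative only).
df = pd.DataFrame({
    "t_id": ["AAAA", "AAAA", "AAAA", "BBBB", "BBBB"],
    "A": [1, 3, 5, 7, 9],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
})

# groupby splits the frame once instead of re-filtering it for every t_id.
# sort=False keeps the groups in order of first appearance, which matches
# the order produced by df['t_id'].unique() in the loop version.
result = [
    # tolist() converts each column to plain Python values before zipping
    list(zip(group["A"].tolist(), group["B"].tolist()))
    for _, group in df.groupby("t_id", sort=False)
]

print(result)
```

The idea is that each group is handled with whole-column operations, so the cost of building a Series per row (which is what iterrows does) goes away.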