Consider the following dataset:
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.array([['a', 1, 2, 3, 'T'],
                                 ['b', 4, 5, 6, 'T'],
                                 ['b', 9, 9, 39, 'T'],
                                 ['c', 16, 17, 18, 'N']]),
                  columns=['id', 'A', 'B', 'C', 'Active'])
id A B C Active
a 1 2 3 T
b 4 5 6 T
b 9 9 39 T
c 16 17 18 N
I need to augment each row of each group (id) with the rows where Active = T, which means the result should look like this:
a 1 2 3 a 1 2 3
b 4 5 6 a 1 2 3
b 9 9 39 a 1 2 3
a 1 2 3 b 4 5 6
b 4 5 6 b 4 5 6
b 9 9 39 b 4 5 6
a 1 2 3 b 9 9 39
b 4 5 6 b 9 9 39
b 9 9 39 b 9 9 39
a 1 2 3 c 16 17 18
b 9 9 39 c 16 17 18
b 4 5 6 c 16 17 18
I have an idea that I could not implement. First, make a new dataset by filtering the data: take all rows where the Active column is equal to T and save them in a new DataFrame.
df_t = df[df['Active'] == 'T']
Then, for each row of df, add a row vector from the df_t dataset, which means:
for sample in df:                          # each row of df
    for t in df_t:                         # each row with Active == 'T'
        row_new = sample + t               # join the row vectors of df and df_t together
        df_new = concat(df_new, row_new)   # collect the joined rows
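The nested loop above amounts to pairing every row of df with every row of df_t, which could be sketched as a cross join instead of an explicit loop. This is only a sketch under a few assumptions: it relies on `merge(how='cross')`, which needs pandas 1.2 or later; the suffixes `_t` and `_all` are illustrative names that are not part of the original data; and the data is rebuilt from a plain list of rows (rather than `np.array`) so that the block runs on its own and the numeric columns stay integers.

```python
import pandas as pd

# Rebuild the example data from a list of rows so the numeric columns stay integers.
df = pd.DataFrame(
    [['a', 1, 2, 3, 'T'],
     ['b', 4, 5, 6, 'T'],
     ['b', 9, 9, 39, 'T'],
     ['c', 16, 17, 18, 'N']],
    columns=['id', 'A', 'B', 'C', 'Active'],
)

# Keep only the active rows, as in the filtering step.
df_t = df[df['Active'] == 'T']

# Pair every row of df with every active row (Cartesian product).
# how='cross' requires pandas >= 1.2; the suffixes are illustrative names.
pairs = df.drop(columns='Active').merge(
    df_t.drop(columns='Active'),
    how='cross',
    suffixes=('_all', '_t'),
)

# Put the active-row columns first, matching the layout of the desired output.
t_cols = [c for c in pairs.columns if c.endswith('_t')]
all_cols = [c for c in pairs.columns if c.endswith('_all')]
df_new = pairs[t_cols + all_cols]

print(df_new)
```

With 4 rows in df and 3 active rows, df_new has 12 rows. The rows are ordered by the row of df first and then by the active row, so the order within the last group may differ from the listing above.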
I would really appreciate any comments and suggestions on how to implement this idea!