I have a sample dataset:
import pandas as pd

df = {'ID': ['H1', 'H2', 'H3', 'H4', 'H5', 'H6'],
      'AA1': ['C', 'B', 'B', 'X', 'G', 'G'],
      'AA2': ['W', 'K', 'K', 'A', 'B', 'B'],
      'name': ['n1', 'n2', 'n3', 'n4', 'n5', 'n6']}
df = pd.DataFrame(df)
It looks like this:
df
Out[32]:
AA1 AA2 ID name
0 C W H1 n1
1 B K H2 n2
2 B K H3 n3
3 X A H4 n4
4 G B H5 n5
5 G B H6 n6
I want to group by AA1 and AA2 so that each unique (AA1, AA2) pair appears once; it doesn't matter which ID and name values are kept for each pair. I then want to write the result to a .csv file, so that the file looks like:
AA1 AA2 ID name
C W H1 n1
B K H2 n2
X A H4 n4
G B H5 n5
I tried this code:
df.groupby('AA1','AA2').apply(to_csv('merged.txt', sep = '\t', index=False))
but to_csv was not recognized. What can I pass to .apply() so that the groupby result is written to a .csv file?
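
For reference, here is a minimal sketch of one way to get the output described above. It assumes that keeping the first row seen for each (AA1, AA2) pair is acceptable, which matches the "it doesn't matter which ID and name" requirement. It does not use .apply() at all: to_csv is a DataFrame method, so it is called on the de-duplicated frame. The names unique_pairs and unique_pairs_gb are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['H1', 'H2', 'H3', 'H4', 'H5', 'H6'],
    'AA1': ['C', 'B', 'B', 'X', 'G', 'G'],
    'AA2': ['W', 'K', 'K', 'A', 'B', 'B'],
    'name': ['n1', 'n2', 'n3', 'n4', 'n5', 'n6'],
})

# Option 1: keep the first row for each unique (AA1, AA2) pair.
unique_pairs = df.drop_duplicates(subset=['AA1', 'AA2'])

# Option 2: the same idea with groupby. Note the keys are passed as a list;
# sort=False keeps the pairs in order of first appearance.
unique_pairs_gb = df.groupby(['AA1', 'AA2'], sort=False, as_index=False).first()

# to_csv is called on the resulting DataFrame, not passed to .apply().
unique_pairs.to_csv('merged.txt', sep='\t', index=False)
```

Both options keep H1, H2, H4 and H5 for this sample. The column order in the file follows the DataFrame's column order, so select columns explicitly (for example unique_pairs[['AA1', 'AA2', 'ID', 'name']]) if a specific order is needed.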