How can I write a groupby object to a CSV file? My original input TSV file has the following format:

id     score    domain
1       5         x
2       3         x
1       4         y
2       2         y

I need the output TSV file to be grouped by id and, within each id, sorted by score in descending order, so that it looks like this:

id      score     domain
1        5          x
1        4          y
2        3          x
2        2          y

Any suggestions on how I should do that? I tried some groupby and sort_values calls in pandas, but they did not produce the required output. Thanks!

  • `df = df.sort_values(['id', 'score'], ascending=[True, False])` – BENY Jun 06 '18 at 20:05
  • How can I limit the output to only the highest-scoring entry for each id? For example, only the rows `1 5 x` and `2 3 x` (id, score, domain). – yuhengd Jun 06 '18 at 20:19
  • You can take advantage of the fact that it's already sorted, then just group and call head. After the sort above, just do `df.groupby('id').head(1)` – ALollz Jun 06 '18 at 20:26
  • @yuhengd After sorting, use `drop_duplicates` on `id` with `keep='first'`. – BENY Jun 06 '18 at 20:29
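
Putting the comments above together, here is a minimal sketch of the whole flow, assuming the column names from the question (`id`, `score`, `domain`). The sample data is inlined so the snippet runs on its own, and the variable names and the file names mentioned in the comments are hypothetical, not from the original post.

```python
import io

import pandas as pd

# Sample rows copied from the question. A real script would read the file
# instead, e.g. pd.read_csv("input.tsv", sep="\t") (hypothetical file name).
raw = "id\tscore\tdomain\n1\t5\tx\n2\t3\tx\n1\t4\ty\n2\t2\ty\n"
df = pd.read_csv(io.StringIO(raw), sep="\t")

# Sort by id ascending, then by score descending within each id.
sorted_df = df.sort_values(["id", "score"], ascending=[True, False])

# Serialize as tab-separated text. With no path argument, to_csv returns
# the text as a string; pass a path such as "output.tsv" to write a file.
tsv_text = sorted_df.to_csv(sep="\t", index=False)

# Highest-scoring row per id: the first row of each group once sorted.
top_df = sorted_df.groupby("id").head(1)

# Alternative on the same sorted data: keep the first occurrence of each id.
top_df_alt = sorted_df.drop_duplicates(subset="id", keep="first")
```

No groupby object needs to be written out here: a two-key sort already gives the grouped layout, and `groupby` is only used for the follow-up question about keeping the top row per id.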
