I have a df like this:

df <- data.frame("Part" = c("y","z","y","z","x","y"), "Prod" = c("a","d","e","d","t","a"))
> df
    Part Prod
1    y    a
2    z    d
3    y    e
4    z    d
5    x    t
6    y    a

I want an output like this:

df2
   Part Prod
1    x    t
2    y    a
3    y    e
4    z    d

I want a result that summarizes the data without duplication in the "Part" column. How can I do this in R? Thanks

DaveM
  • In your expected output `y` is duplicated in `Part`. Do you need `aggregate(Part~Prod, df, head, 1)`? – Ronak Shah Apr 01 '20 at 02:23
  • Thanks Ronak, it worked. Would this also work for more than two columns? Suppose I have "Color" as a third column; how would I aggregate that in a single df? Sorry, I am a beginner in R. Thanks – DaveM Apr 01 '20 at 02:28
  • I added an answer. With multiple columns this is easy to do in `dplyr` by listing all the columns in `vars()`. If you have many such columns, you can specify them by position `summarise_at(1:5` or by a range of columns `summarise_at(vars(Part:Color)` – Ronak Shah Apr 01 '20 at 02:36

1 Answer

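We can use `aggregate`, keeping the first value of each column within every `Prod` group (this assumes `df` has the additional `Color` column mentioned in the comments):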

aggregate(cbind(Part, Color)~Prod, df, head, 1)
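
A self-contained sketch of the same idea, in case it helps to run it end to end. The question's `df` has no `Color` column, so the values below are made up purely for illustration, and this version uses the data-frame method of `aggregate` (a `by` list) instead of the formula interface:

```r
# Data from the question plus a hypothetical Color column.
df <- data.frame(
  Part  = c("y", "z", "y", "z", "x", "y"),
  Prod  = c("a", "d", "e", "d", "t", "a"),
  Color = c("red", "blue", "green", "blue", "black", "red"),  # assumed values, not in the question
  stringsAsFactors = FALSE
)

# Group by Prod and keep the first Part and Color seen in each group.
# n = 1 is passed through to head().
res <- aggregate(df[c("Part", "Color")], by = df["Prod"], FUN = head, n = 1)
res
```

The result has one row per distinct `Prod`, with `Prod` as the first column.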

Or using `dplyr`:

library(dplyr)
df %>% group_by(Prod) %>% summarise_at(vars(Part, Color), first)
Ronak Shah
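
For what it's worth, the `df2` shown in the question is the set of distinct rows of `df`, sorted by `Part`. If that is the actual goal, a minimal base R sketch (an alternative to the answer's approach, not taken from it) could look like this:

```r
# Data from the question.
df <- data.frame(
  Part = c("y", "z", "y", "z", "x", "y"),
  Prod = c("a", "d", "e", "d", "t", "a"),
  stringsAsFactors = FALSE
)

# Drop repeated (Part, Prod) rows, then sort by Part and Prod.
df2 <- unique(df)
df2 <- df2[order(df2$Part, df2$Prod), ]
rownames(df2) <- NULL  # renumber the rows from 1
df2
```

With `dplyr` loaded, `distinct(df)` followed by `arrange(Part, Prod)` should give the same rows.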