I have an "emp" dataset with name, grade, and value columns (the value is based on the manager's feedback):

  name grade value
1  Ram     R   2.1
2  Sam     R   2.4
3  Jam     R   5.3
4 Bill     S   4.2
5 Claw     S   3.6
6  Men     S   1.2
7  Jay     P   5.3
8  Kay     P   3.8
9  Ray     P   3.2

With aggregate(value ~ grade, data = emp, FUN = min) I got the minimum value for each grade:

  grade value
1     P   3.2
2     R   2.1
3     S   1.2

Based on the minimum value, I want to display only the grade and the name, without the value column. Is this possible with aggregate() in R?

David Arenburg
srivatsa
  • http://stackoverflow.com/questions/6289538/aggregate-a-dataframe-on-a-given-column-and-display-another-column – germcd Apr 30 '15 at 16:03

1 Answer

Here's a possible approach using data.table:

library(data.table)
setDT(emp)[, .(name = name[which.min(value)]), by = grade]
#    grade name
# 1:     R  Ram
# 2:     S  Men
# 3:     P  Ray

Here's another, using dplyr:

library(dplyr)
emp %>%
  group_by(grade) %>%
  summarise(name = name[which.min(value)])

# Source: local data table [3 x 2]
# 
#   grade name
# 1     R  Ram
# 2     S  Men
# 3     P  Ray

Or with base R:

do.call(rbind, by(emp, emp$grade, 
                  function(x) data.frame(grade = as.character(x$grade[1L]), 
                                         name = x$name[which.min(x$value)])))
#   grade name
# P     P  Ray
# R     R  Ram
# S     S  Men
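
If aggregate() has to be part of the solution, one workaround is to keep the aggregate() call from the question and join its result back to emp to recover the names. This is only a sketch: aggregate() on its own still returns just the summarised column, the data frame is rebuilt here from the table in the question, and the names mins and res are illustrative.

```r
# Recreate the question's data (values copied from the table in the question).
emp <- data.frame(
  name  = c("Ram", "Sam", "Jam", "Bill", "Claw", "Men", "Jay", "Kay", "Ray"),
  grade = rep(c("R", "S", "P"), each = 3),
  value = c(2.1, 2.4, 5.3, 4.2, 3.6, 1.2, 5.3, 3.8, 3.2),
  stringsAsFactors = FALSE
)

# Step 1: per-grade minimum, exactly as in the question.
mins <- aggregate(value ~ grade, data = emp, FUN = min)

# Step 2: merge() joins on the shared columns (grade and value), which keeps
# only the rows of emp that hold a group minimum. Then drop the value column.
res <- merge(mins, emp)[, c("grade", "name")]
print(res)
```

Unlike which.min(), which returns only the first minimum, this join keeps every row tied for the minimum within a grade.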
David Arenburg
  • Your effort is appreciated, but my question was: is it not possible to use the aggregate() function to do the same? – srivatsa Apr 30 '15 at 15:17
  • I don't think so. Why would you insist on `aggregate`? – David Arenburg Apr 30 '15 at 15:20
  • It's the wrong tool. `aggregate`'s functions only "see" a single vector at a time, so they don't have the necessary data to work on two columns. The base paradigm would be `lapply(split(,), FUN)`. – IRTFM Apr 30 '15 at 15:23
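
The lapply(split(,), FUN) paradigm mentioned in the last comment could look roughly like the sketch below. It assumes the emp data from the question, rebuilt here so the example runs on its own; the names pieces, rows, and res are illustrative.

```r
# Recreate the question's data (values copied from the table in the question).
emp <- data.frame(
  name  = c("Ram", "Sam", "Jam", "Bill", "Claw", "Men", "Jay", "Kay", "Ray"),
  grade = rep(c("R", "S", "P"), each = 3),
  value = c(2.1, 2.4, 5.3, 4.2, 3.6, 1.2, 5.3, 3.8, 3.2),
  stringsAsFactors = FALSE
)

# Split the data frame into one piece per grade. Each piece is a full data
# frame, so the function applied to it can see every column at once.
pieces <- split(emp, emp$grade)

# From each piece, keep the row with the smallest value and only the
# grade and name columns.
rows <- lapply(pieces, function(d) d[which.min(d$value), c("grade", "name")])

# Stack the one-row results back into a single data frame.
res <- do.call(rbind, rows)
print(res)
```

Because each piece passed to the function is a whole data frame rather than a single vector, the function can look up name from the position of the minimum value, which is what aggregate() cannot do.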