Getting median values of an unknown number of unique rows in R

Question

thanks in advance for the help. I am working with a series of .csv files that contain data in the following format:

ID<-c(1,1,1,1,2,2,3,3,3,4,4,4,4,5,5,6,7,7)
Length<-c(3,3,4,7,6,4,7,8,8,9,3,2,4,3,6,8,5,3)
dummydata<-cbind(ID,Length)

dummydata<-cbind(ID,Length)

> dummydata
      ID Length
 [1,]  1      3
 [2,]  1      3
 [3,]  1      4
 [4,]  1      7
 [5,]  2      6
 [6,]  2      4
 [7,]  3      7
 [8,]  3      8
 [9,]  3      8
[10,]  4      9
[11,]  4      3
[12,]  4      2
[13,]  4      4
[14,]  5      3
[15,]  5      6
[16,]  6      8
[17,]  7      5
[18,]  7      3

What I need to do is find the median Length of each unique number (1,2,3, etc). I can do this individually by using the following code:

one<-median(dummydata[dummydata$ID=="1","Length"])
two<-median(dummydata[dummydata$ID=="2","Length"])
three<-median(dummydata[dummydata$ID=="3","Length"])

However, in every .csv file, there are thousands of ID's, and creating the above code for each number is not feasible. Is there a way for me to find the median Length of each unique ID number for the entire thousands long data set? Ideally I would be able to create a new column with these medians.

I would appreciate any insight into this issue!

This is a straight forward split-apply-combine problem. See http://stackoverflow.com/q/11562656/892313 but use median rather than mean. — Brian Diggs, Mar 04 '14 at 00:15

score 3 · Accepted Answer · edited Mar 04 '14 at 00:14

3

have a look at tapply.

for example:

with(as.data.frame(dummydata), tapply(Length,list(ID),median))
#   1   2   3   4   5   6   7 
# 3.5 5.0 8.0 3.5 4.5 8.0 4.0

edited Mar 04 '14 at 00:14

flodel

87,577
21
185
223

answered Mar 04 '14 at 00:12

amit

3,332
6
24
32

score 2 · Answer 2 · edited Mar 04 '14 at 00:20

2

A dplyr solution:

library(dplyr)

as.data.frame(dummydata) %.% group_by(ID) %.% summarise(Median = median(Length))

edited Mar 04 '14 at 00:20

flodel

87,577
21
185
223

answered Mar 04 '14 at 00:19

Hugh

15,521
12
57
100

Getting median values of an unknown number of unique rows in R

2 Answers2