1

year  1999 1999 1999 2003 2003 2005 2005 2005 2005 2007 2009 2009 2009
A1      15    7   24    6   65    5   89   56   21   15   19    7   23

The table above shows a data frame. I want a vector, say "median1", that holds the median of the A1 values for each year. I know this is easy with a for loop, but I am looking for a vectorized solution.

carl whyte
  • Please, take your time and make an effort to edit your question. You can find [some alternatives here](http://stackoverflow.com/a/16657546/1315767) – Jilber Urbina Mar 28 '14 at 21:40

4 Answers

1

With the data.table package, assuming your data.frame is called DF:

library(data.table)
DT = data.table(DF)
DT[,median(a1),by='year']
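
For reference, here is a minimal sketch of the same idea run on the data from the question. It assumes the column is spelled A1, as in the question's table (R is case-sensitive, so a1 and A1 are different names). The names res and median1 are only illustrative. Without an explicit name, data.table calls the result column V1.

```r
library(data.table)

# Data from the question; DF is the name assumed in this answer.
DF <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2005, 2005, 2005, 2005, 2007, 2009, 2009, 2009),
  A1   = c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23)
)
DT <- as.data.table(DF)

# One row per year, with the median in a named column instead of the default V1.
res <- DT[, .(median1 = median(A1)), by = year]

# Plain numeric vector, one element per year, in order of first appearance.
median1 <- res$median1
median1
```

If a data.table with the year next to each median is enough, res can be used directly and the last step skipped.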
statquant
1

Use ave, which is a base R function. Combining ave with transform gives a tidy result. Assume dat is your data.frame:

> transform(dat, Median= ave(a1, year, FUN=median))
  year a1 Median
1 1999 20   15.0
2 1999 15   15.0
3 1999 11   15.0
4 2003 11    7.0
5 2003  3    7.0
6 2007 89   40.5
7 2007 25   40.5
8 2007 56   40.5
9 2007 12   40.5

If you only want a vector holding, for each row, the median of that row's year, you can do:

> with(dat, ave(a1, year, FUN=median))
[1] 15.0 15.0 15.0  7.0  7.0 40.5 40.5 40.5 40.5
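
Note that ave returns one value per row, not one per year. A small sketch on the question's own data shows the difference and one way to collapse the result. It assumes the column is named A1 as in the question's table, and per_year is just an illustrative name.

```r
# Data from the question, stored under the name used in this answer.
dat <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2005, 2005, 2005, 2005, 2007, 2009, 2009, 2009),
  A1   = c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23)
)

# Same length as the data: each row gets the median of its own year.
median1 <- with(dat, ave(A1, year, FUN = median))
length(median1)  # 13, one entry per row

# Keep only the first entry of each year to get one median per year.
per_year <- median1[!duplicated(dat$year)]
per_year         # 15.0 35.5 38.5 15.0 19.0
```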
Jilber Urbina
1

In base R, you can do this:

foo <- data.frame(
  year=c(1999,1999,1999,2003,2003,2005,2005,2005,2005,2007,2009,2009,2009),
  A1=c(15,7,24,6,65,5,89,56,21,15,19,7,23))
by(foo$A1,foo$year,median)

Strictly speaking, the result is an object of class "by" rather than a plain vector, but you can fix that:

as.vector(by(foo$A1,foo$year,median))

by() is always helpful when you want to do an operation by groups.
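
One thing to keep in mind is that as.vector() drops the year labels. If you want to keep them, a possible variation (a sketch, not the only way) is to use tapply(), which takes the same three arguments here and returns the medians labelled by year. The names med and median1 are only illustrative.

```r
foo <- data.frame(
  year = c(1999, 1999, 1999, 2003, 2003, 2005, 2005, 2005, 2005, 2007, 2009, 2009, 2009),
  A1   = c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23)
)

# Medians per year; the years become the names (as character strings).
med <- tapply(foo$A1, foo$year, median)

# Look up a single year by its label.
med["2003"]

# Convert to a plain numeric vector while keeping the year labels.
median1 <- setNames(as.vector(med), names(med))
median1
```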

Stephan Kolassa
0

It's not clear to me, but it seems like you want the median of each year? If so...

## set up the data
> year <- c(1999,1999,1999,2003,2003,2005,2005,2005,2005,2007,2009,2009,2009)
> A1 <- c(15, 7, 24, 6, 65, 5, 89, 56, 21, 15, 19, 7, 23)
> dd <- data.frame(year, A1)

## solution: split() groups A1 by year and supplies the names, in sorted year order
> xx <- sapply(split(dd$A1, dd$year), median)
> xx
1999 2003 2005 2007 2009 
15.0 35.5 38.5 15.0 19.0 
Rich Scriven