
I have multiple columns of data, let's say x and time. I want to make a histogram of column x, and color each bar based off an aggregation of the values in column time, where the aggregation is grouped by the breaks used for the histogram. So,

d = cbind(c(rep(1,3), rep(2,3)), c(10,20,10,20,10,20))
names(d) = c("x", "time")
hist(d[,"x"])

Gives me a nice barplot, and let's say I want something like this for my colors:

palette(rainbow(25))
hist(d[,"x"], col=d[,"time"], n=10)

I would like col to be a vector of length 10 holding an aggregate (such as the mean) of the time column for each histogram bin.

Andy
Hamy
  • your code does not run as is. Perhaps you wanted `data.frame(x=..., time=...)` rather than `cbind(...)`? It errors on the plotting step because you've created an array with `cbind` and named two of the twelve entries in it. – Justin Aug 08 '12 at 20:45

2 Answers


I would do this with plyr and ggplot2:

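library(plyr)
library(ggplot2)

# Example data: four x values with 16 random times
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE))
# Add each x group's mean time as a new column
d <- ddply(d, .(x), transform, mean.time=mean(time))

# One bar per x value, filled by that group's mean time
ggplot(d, aes(x=x, group=x, fill=mean.time)) +
  geom_histogram()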

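If plyr is not installed, the per-group mean column can probably be built with base R's ave(), which defaults to the mean. This is only a sketch on made-up data; the column name mean.time is kept from the code above so the same ggplot call should work unchanged.

```r
# Made-up data for illustration only
d <- data.frame(x = rep(1:2, each = 2), time = c(10, 20, 30, 50))

# ave() returns one value per row: the mean of time within each x group
d$mean.time <- ave(d$time, d$x)

print(d)
```

The result has the same shape as the ddply(..., transform, ...) output: every row carries its group's mean.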

Andy

If I understood correctly, you would like to average the time values for each x and plot the result as bars. But which colours do you want to use: a gradient or individual colours, based on the mean time values or on the x values?

Consider this example as a starting point:

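library(ggplot2)
d <- data.frame(x=rep(1:4, each=4), time=sample(10:100, 16, replace=TRUE)) # thanks to Andy :)
# Bar height is the mean time per x; one discrete colour per x.
# `fun` replaces `fun.y`, which is deprecated since ggplot2 3.3.0.
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun=mean, geom="bar", aes(fill=factor(x)))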

or

# Same bars, but filled on a continuous gradient of x
ggplot(d, aes(x=factor(x), y=time)) +
  stat_summary(fun=mean, geom="bar", aes(fill=x))
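For something closer to the original hist() call, here is a minimal base-graphics sketch. It assumes the bars should be coloured by the mean time in each histogram bin, and it reuses the breaks that hist() itself picks. The data, the 25-colour rainbow palette, and the names bin, mean_time and cols are only illustrative.

```r
# Made-up data for illustration only
set.seed(1)
d <- data.frame(x = rep(1:4, each = 4),
                time = sample(10:100, 16, replace = TRUE))

# Compute the histogram without drawing it, to obtain the breaks
h <- hist(d$x, plot = FALSE)

# Assign each x to a bin with the same breaks and interval rules hist() uses
bin <- cut(d$x, breaks = h$breaks, include.lowest = TRUE, right = TRUE)

# Mean time per bin, in bin order; empty bins come back as NA
mean_time <- tapply(d$time, bin, mean)

# Map the bin means onto a 25-colour palette (NA stays NA, i.e. no fill)
pal <- rainbow(25)
cols <- pal[cut(mean_time, breaks = 25, labels = FALSE)]

# Draw the precomputed histogram with one colour per bar
plot(h, col = cols)
```

Because cols has one entry per bin, its length follows the number of breaks, so passing a different breaks argument to hist() should carry through without other changes.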
DrDom