
I was following this approach (https://stackoverflow.com/a/3542115/3483997) to plot two histograms, from samples of different sizes, out of a single data frame.

library(ggplot2)
ROS_SPITFIRE <- data.frame(length = rnorm(100, 0.76406353, 0.500970292))
ROS_FARSITE <- data.frame(length = rnorm(398, 3.48366834170854, 2.19050069588744))

# Now, combine your two dataframes into one. First make a new column in each.
ROS_SPITFIRE$veg <- 'ROS_SPITFIRE'
ROS_FARSITE$veg <- 'ROS_FARSITE'

# and combine into your new data frame vegLengths
vegLengths <- rbind(ROS_SPITFIRE, ROS_FARSITE)

# now make your lovely plot
ggplot(vegLengths, aes(length, fill = veg)) + geom_density(alpha = 0.3)

p <- ggplot(vegLengths, aes(length, fill = veg)) + geom_histogram(alpha = 0.5, aes(y = ..density..), position = 'identity')
p <- p + xlim(0, 15)

My problem appeared when I created the new column in each data frame. It generates negative values, so my final distribution plots show negative values on the x-axis. Does anyone know how to fix it?

Thx

eFF
  • You are creating your variable length from a normal distribution, which can therefore take on negative values. Thus your histogram will also contain these values – Marco Apr 01 '14 at 10:22
  • Aha, I guess I used it wrongly, since my data do not follow a normal distribution. Is there an equivalent of rnorm for non-normal cases? – eFF Apr 01 '14 at 10:40
  • Try `sample(seq(.1, 1, by = .1), 100, replace = T)`. You can also set the probability of each value with the `prob` argument of `sample` – David Arenburg Apr 01 '14 at 10:48
  • You also don't have to supplement your dataframes - you could overlay the two using the data argument in geom_histogram so: `ggplot(ROS_SPITFIRE, aes(length, y = ..density..)) + geom_histogram(alpha = 0.5, fill = "spitfire", position = 'identity')+ geom_histogram(data=ROS_FARSITE, fill="farsite",alpha = 0.5, position = 'identity')` – Steph Locke Apr 01 '14 at 11:03
  • See also `?Distributions` for a list of distributions from which random numbers can be generated. – jbaums Apr 01 '14 at 11:04
  • Nice @jbaums, I wasn't familiar with this one – David Arenburg Apr 01 '14 at 11:19
  • Thank you guys for the answers! I tried your code @Steph Locke, but looks like R doesn't like it. I got "Error: ggplot2 doesn't know how to deal with data of class numeric" – eFF Apr 01 '14 at 12:38
  • hmm .. I had to change the fill to be inside an aes, but it works `ROS_SPITFIRE <- data.frame(length = rnorm(100, 0.76406353, 0.500970292)) ROS_FARSITE <- data.frame(length = rnorm(398, 3.48366834170854,2.19050069588744)) ggplot(ROS_SPITFIRE, aes(length, y = ..density..,fill="spitfire")) + geom_histogram(alpha = 0.5, position = 'identity')+ geom_histogram(data=ROS_FARSITE, aes(fill="farsite"),alpha = 0.5, position = 'identity')` http://imgur.com/XgXp4Fr – Steph Locke Apr 01 '14 at 14:04
  • awesome, thanks @Steph Locke. I am still struggling with the negative values. – eFF Apr 01 '14 at 14:19
  • If you need to stick with the specific `mean`s and `sd`s used but exclude any negative values you can filter the datasets e.g. `ROS_SPITFIRE[ROS_SPITFIRE$length>0,]` or do limits on the chart e.g. `xlim(0,12)`. If you can amend the distribution you can pick distributions or values that do not result in negative values – Steph Locke Apr 01 '14 at 14:33
  • It is a pity because I cannot vote you up guys. Thanks a lot – eFF Apr 01 '14 at 15:19

1 Answer


If you need to stick with the specific means and sds used but exclude any negative values, you can filter the datasets, e.g. `ROS_SPITFIRE[ROS_SPITFIRE$length>0,]`, or set limits on the chart, e.g. `xlim(0,12)`.
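As a minimal sketch of the filtering option, assuming the same simulated data as in the question (the seed and the names `spitfire_pos` and `farsite_pos` are mine, added only for illustration):

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible

ROS_SPITFIRE <- data.frame(length = rnorm(100, 0.76406353, 0.500970292))
ROS_FARSITE  <- data.frame(length = rnorm(398, 3.48366834170854, 2.19050069588744))

# Keep only the rows with strictly positive lengths.
# drop = FALSE keeps the result a data frame even though there is one column.
spitfire_pos <- ROS_SPITFIRE[ROS_SPITFIRE$length > 0, , drop = FALSE]
farsite_pos  <- ROS_FARSITE[ROS_FARSITE$length > 0, , drop = FALSE]

# Number of rows dropped from each data frame
nrow(ROS_SPITFIRE) - nrow(spitfire_pos)
nrow(ROS_FARSITE) - nrow(farsite_pos)
```

Note that on a one-column data frame, subsetting rows without `drop = FALSE` returns a plain numeric vector, which ggplot will not accept as `data`. Filtering also shifts the sample mean and sd slightly, since the lower tail is cut off.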

If you can amend the distribution, you can pick distributions or values that do not result in negative values. @David Arenburg and @jbaums provide guidance for going down this route, such as using `sample(seq(.1, 1, by = .1), 100, replace = T)` or evaluating other distribution options (see `?Distributions`).
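One possible way to do this, sketched under the assumption that a gamma distribution is acceptable for the data (that choice is mine, not something the question establishes), is to match the gamma's shape and rate to the desired mean and sd. The helper `rgamma_ms` is hypothetical:

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Hypothetical helper: gamma draws with a given mean and sd.
# For a gamma distribution, mean = shape / rate and var = shape / rate^2,
# so shape = (mean / sd)^2 and rate = mean / sd^2.
rgamma_ms <- function(n, mean, sd) {
  shape <- (mean / sd)^2
  rate  <- mean / sd^2
  rgamma(n, shape = shape, rate = rate)
}

ROS_SPITFIRE <- data.frame(length = rgamma_ms(100, 0.76406353, 0.500970292))
ROS_FARSITE  <- data.frame(length = rgamma_ms(398, 3.48366834170854, 2.19050069588744))

# All simulated values are positive
range(ROS_SPITFIRE$length)
range(ROS_FARSITE$length)
```

These data frames can be used in the plotting code unchanged. The shape of the histogram will differ from the normal case (the gamma is right-skewed), so this only makes sense if that is closer to the real data.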

You can also cut out some steps by going straight to charting and provide a limit:

ggplot(ROS_SPITFIRE, aes(length, y = ..density.., fill = "spitfire")) +
  geom_histogram(alpha = 0.5, position = 'identity') +
  geom_histogram(data = ROS_FARSITE, aes(fill = "farsite"),
                 alpha = 0.5, position = 'identity') +
  xlim(0, 12)


Steph Locke
  • Hello, I am here again. I was wrong from the beginning. What I have to plot is basically 2 different datasets in a single histogram based on counts per pixel. The number of pixels is different in each dataset. How can I plot a normal histogram in ggplot2 without a density distribution involved? Many thanks beforehand – eFF Apr 24 '14 at 16:37
  • You should be able to use geom_bar(). If you need more info, I would suggest a new SO question – Steph Locke Apr 24 '14 at 16:49
  • I think I sorted it out. I padded the shorter dataset with NA and then plotted the histograms with the following structure: `qplot(V3, data=dsf, geom='histogram',xlab="m/min",main="ROS",fill=I('#FF9999'),alpha = 0.5)+ geom_histogram(aes(V8), data=dsf, fill='#56B3E9', alpha = 0.5)`. Thanks, by the way! – eFF Apr 27 '14 at 06:16
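
For the count-based histogram asked about in the last comments, padding with NA may not be necessary. A minimal sketch, assuming simulated stand-ins for the two real datasets (the vectors `ros_a` and `ros_b`, the labels "SPITFIRE" and "FARSITE", and `bins = 30` are all my assumptions): stack the values into one long data frame with a label column, so the two groups can have different lengths.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Assumption: simulated stand-ins for the two real datasets of different size
ros_a <- rgamma(100, shape = 2, rate = 2.5)
ros_b <- rgamma(398, shape = 2.5, rate = 0.7)

# Long format: one row per value, plus a column saying which dataset it is from
ros <- data.frame(
  ros   = c(ros_a, ros_b),
  model = rep(c("SPITFIRE", "FARSITE"),
              times = c(length(ros_a), length(ros_b)))
)

# Without y = ..density.., geom_histogram plots raw counts per bin
p <- ggplot(ros, aes(ros, fill = model)) +
  geom_histogram(alpha = 0.5, position = "identity", bins = 30) +
  labs(x = "m/min", y = "count", title = "ROS")
p
```

With `position = "identity"` the two histograms overlap instead of stacking, and because the y-axis is counts, the larger dataset will naturally have taller bars.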