I wish to create a density plot with qplot() comprising three response variables. The graph would be Density (y) against Elevation (x), with three colour-coded density curves showing how the density of each variable changes as Elevation (x-axis) changes.

First I subsetted the three response variables (3 columns in my dataset "CAIRNGORM") into a small subset called "ZONES":

ZONES<-CAIRNGORM[c("prop_Cal", "prop_Emp", "prop_Jun")]

Then I tried to create the qplot:

library(ggplot2)
qplot(Elevation, data=CAIRNGORM, geom="density", fill="ZONES", alpha=I(0.5))

which creates a plot but instead of giving me three traces, one for each of prop_Cal, prop_Emp and prop_Jun, I just have one trace and it appears to show the density of my Elevation data - a straight line!

I would really appreciate somebody's help with this - how do I instruct qplot to build three traces contained within "ZONES" instead of the x-variable? Thanks

Edit: Shortened version of my data (trying to put correct code formatting in Stack Overflow):

> head(CAIRNGORM)
  position group Elevation 
1       Q1     A       680   
2       Q2     A       730   
3       Q3     A       780  
4       Q4     A       830     
5       Q5     A       880      
6       Q6     A       930   
  prop_bar prop_Cal prop_Vac prop_Emp prop_Jun prop_Ces prop_Eri ZONES.prop_Cal
1     0.00     1.00      0.0        0        0     0.36      0.4           1.00
2     0.00     1.00      0.0        0        0     0.28      0.0           1.00
3     0.00     0.84      0.6        0        0     0.48      0.0           0.84
4     0.00     1.00      0.0        0        0     0.00      0.0           1.00
5     0.24     0.76      0.0        0        0     0.72      0.0           0.76
6     0.36     0.72      0.0        0        0     0.00      0.0           0.72
  ZONES.prop_Emp ZONES.prop_Jun
1              0              0
2              0              0
3              0              0
4              0              0
5              0              0
6              0              0

> head(ZONES)
  prop_Cal prop_Emp prop_Jun
1     1.00        0        0
2     1.00        0        0
3     0.84        0        0
4     1.00        0        0
5     0.76        0        0
6     0.72        0        0
talat
  • Welcome to SO! It would be most helpful if you could take some time to provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), especially for a question that involves plotting. – talat Dec 21 '14 at 21:28
  • It seems that you are working with several data frames in your question. Please provide a small sample of each in your question – talat Dec 21 '14 at 21:41
  • Hi - thanks for editing it - I think I've worked out how to edit into code format - you put four spaces before each line? Anyway, about my data, I'm using just one main dataset, CAIRNGORM, and created a subset called ZONES from the three important variables within CAIRNGORM in the hope that it would allow the qplot to work, but alas it has not. Apologies, coding and R are not my forte. I'm sure it's something simple I've done wrong. – Martin Christophe Ross Dec 21 '14 at 22:07
  • It looks like you want a density plot of the "response variables" (meaning the columns in `ZONES`?) against `Elevation`. But a (one-dimensional) density plot is just a smoothed histogram of the distribution of values in a single variable, rather than the values of one variable relative to another. Can you say more about what you expect the plot to look like? I've posted a potential answer below, but please let me know if that's not what you had in mind and I'll update my answer. – eipi10 Dec 22 '14 at 07:07

1 Answer

ggplot2 prefers data in "long" rather than "wide" format. Here's how to get the three density plots in a single graph. Because two of the columns in your sample data are all zeros, I've created some fake data for illustration:

library(ggplot2)
library(reshape2) # For the melt function

# Fake data
ZONES = data.frame(prop_Cal=rnorm(100), 
                   prop_Emp=rnorm(100,-10,3), 
                   prop_Jun=rnorm(100,10,0.5))

# Melt into long format (take a look at the melted data frame to see what melt is doing)
ZONES.M <- melt(ZONES)

ggplot(ZONES.M, aes(value, fill=variable)) +
     geom_density(alpha=0.5)

`variable` contains the names of each column in your original wide-format data frame. `value` contains the values. Setting the `fill` aesthetic to `variable` tells ggplot to create a separate density plot for each level of `variable`.
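To make the long format concrete, here is a minimal sketch of what `melt` returns for a tiny wide data frame. The data frame `wide`, the seed, and the three-row size are assumptions chosen purely for illustration; they are not part of the original data.

```r
library(reshape2)

set.seed(42)  # assumption: fixed seed only so the illustration is repeatable

# Hypothetical three-row wide data frame with the same column names
wide <- data.frame(prop_Cal = rnorm(3),
                   prop_Emp = rnorm(3, -10, 3),
                   prop_Jun = rnorm(3, 10, 0.5))

# With no id variables given, melt stacks every column
# (it prints a message saying all columns are used as measure variables)
long <- melt(wide)

# One row per original cell: the column name goes into `variable`,
# the cell contents go into `value`
str(long)
```

Each of the three columns contributes three rows, so `long` has nine rows and two columns.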


You can't plot the density of `prop_Cal` or the other two variables against elevation. A (1-dimensional) density plot of a variable is inherently about a single variable. If you're trying to find a relationship between elevation and the other three variables, maybe you want a violin plot. For example:

# Fake data with Elevation added
ZONES = data.frame(Elevation=rep(c(10,20,30,40),each=25), 
                   prop_Cal=rnorm(100), 
                   prop_Emp=rnorm(100,-10,3), 
                   prop_Jun=rnorm(100,10,10))

# Melt into long format, this time with Elevation as an id variable
ZONES.M <- melt(ZONES, id.vars="Elevation")

ggplot(ZONES.M, aes(Elevation, value, group=Elevation)) +
  geom_violin() +
  facet_grid(. ~ variable)

Now we have a density plot for each value of Elevation, separately for each of your original three column variables. (You can also combine several values of elevation first, using the `cut` function, if you want to group by elevation bands.)
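As a rough sketch of the `cut` idea, the code below bins a finer-grained Elevation into bands before melting. The elevation values, the break points, the column name `Band`, and the uniform fake proportions are all assumptions made up for illustration.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # assumption: fixed seed only so the illustration is repeatable

# Fake data: 100 elevations from 680 to 1175 in steps of 5 (hypothetical values)
ZONES <- data.frame(Elevation = seq(680, 1175, by = 5),
                    prop_Cal = runif(100),
                    prop_Emp = runif(100),
                    prop_Jun = runif(100))

# Bin Elevation into four bands; the break points are arbitrary choices
ZONES$Band <- cut(ZONES$Elevation,
                  breaks = c(650, 800, 950, 1100, 1250),
                  dig.lab = 4)

# Keep both Elevation and Band as id variables when melting
ZONES.M <- melt(ZONES, id.vars = c("Elevation", "Band"))

# One violin per elevation band, one panel per original column
p <- ggplot(ZONES.M, aes(Band, value)) +
  geom_violin() +
  facet_grid(. ~ variable)
p
```

Because `Band` is a factor, ggplot draws one violin per band without needing a `group` aesthetic.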


If instead you want a scatterplot of each variable vs. Elevation, you can do this:

ggplot(ZONES.M, aes(Elevation, value, group=Elevation)) +
  geom_point(colour="black", fill="lightblue", alpha=0.5, pch=21) +
  facet_grid(. ~ variable)

If you want to add a regression line (which may be what you're actually looking for if you want to summarise the relationship between Elevation and the other three variables), do this:

ggplot(ZONES.M, aes(Elevation, value, group=Elevation)) +
  geom_point(colour="black", fill="lightblue", alpha=0.5, pch=21) +
  geom_smooth(aes(group=1), method="lm") +  # method="lm" fits a straight regression line
  facet_grid(. ~ variable)
eipi10
  • Wow - thanks! However, I wish to plot Elevation along the x-axis. I tried replacing "value" with "dataset$Elevation" and it did not work. My variable data values correspond to a particular Elevation value and I'd like to represent the densities of each variable at changing Elevation. I don't know if this is possible to do. – Martin Christophe Ross Dec 22 '14 at 07:19