I am trying to plot a multi-element variation diagram and therefore have a discrete x axis and continuous y axis. I have two problems:
- although I have successfully used similar code before, the geom_path command is not plotting anything
- the geom_point function is plotting the wrong data. The first point should be at 212 but is plotted below 10

I have tried two different layouts of the data and code which I have included below but each results in the same image (attached).

Any help would be very much appreciated.
Holly

Data set sample 1:

Chond_normalised <- structure(list(Element = c("Th", "Nb", "La", "Ce", "Pr", "Nd", "Sm", "Zr", "Eu", "Ti", "Gd", "Tb", "Dy", "Th", "Nb", "La", "Ce", "Pr", "Nd", "Sm", "Zr", "Eu", "Ti", "Gd", "Tb", "Dy", "Th", "Nb", "La", "Ce", "Pr", "Nd", "Sm", "Zr", "Eu", "Ti", "Gd", "Tb", "Dy"), ppm = c(212, 65, 73, 49, 38, 26, 12, 25, 6, 6, 7, 6, 5, 8, 10, 122, 95, 73, 55, 26, 4, 17, 1, 14, 9, 7, 41, 46, 74, 57, 49, 43, 28, 19, 20, 6, 18, 13, 9), Sample = c("a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c")), .Names = c("Element", "ppm", "Sample"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L), class = "data.frame")
Chond_normalised

Code 1:

library(ggplot2)
Chond_normalised <- read.csv('filename.csv', header=TRUE, sep=",")
attach(Chond_normalised)
ggplot(data=Chond_normalised, aes(ymin=0.1, ymax=1000, x=Element, y=ppm)) +
geom_path(data=Chond_normalised, aes(y=ppm[Sample=="a"], x=Element[Sample=="a"]), colour="black", size=1.0) +
geom_point(data=Chond_normalised, aes(y=ppm[Sample=="a"], x=Element[Sample=="a"]), colour="red", size=1.0) +
scale_y_log10("Sample / Chondrite", breaks=c(0.1, 1, 10, 100, 1000)) +
scale_x_discrete("", labels=Element)

Data set sample 2:

Chond_normal <- structure(list(Element = c("Th", "Nb", "La", "Ce", "Pr", "Nd", "Sm", "Zr", "Eu", "Ti", "Gd", "Tb", "Dy"), a = c(212, 65, 73, 49, 38, 26, 12, 25, 6, 6, 7, 6, 5), b= c(8, 10, 122, 95, 73, 55, 26, 4, 17, 1, 14, 9, 7)), .Names = c("Element", "a", "b"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L), class = "data.frame")
Chond_normal

Code 2:

library(ggplot2)
Chond_normal <- read.csv('file.csv', header=TRUE, sep=",")
attach(Chond_normal)
ggplot(data=Chond_normal, aes(ymin=0.1, ymax=1000, x=Element, y=Chond_normal[,2:26])) +
geom_path(data=Chond_normal, aes(y=a, x=Element), colour="black", size=1.0) +
geom_point(data=Chond_normal, aes(y=a, x=Element), colour="red", size=1.0) +
scale_y_log10("Sample / Chondrite", breaks=c(0.1, 1, 10, 100, 1000)) +
scale_x_discrete("", labels=Element)

Plot resulting from both code sets above. The line is not plotted and the points do not show the 'sample a' data.

Holly Elliott
  • Neither of your datasets is reproducible. At first glance, you need to `reshape2::melt` your data.frame before plotting. Also, there is no need to set the data in each `geom_` function if it doesn't change. – DrDom Sep 12 '14 at 13:02
  • Apologies, I have edited the dataset script but cannot get R to accept the list of elements as text rather than objects to be searched for. In essence the first dataset is 3 columns: Element, ppm, sample id. The second dataset is many columns: Element, ppm of sample a, b, c etc etc. The reason I have set the data in the geom_ is because I want to add many lines and points for each of the samples so the y= would change for each line. – Holly Elliott Sep 12 '14 at 13:26
  • @HollyElliott were you using `dput()` on the data.frames to make them reproducible? Because those were not copy/pasteable into R. I've tried to reformat them to make them better, but let me know if I've goofed up at all. Also, it's not a good idea to use `attach()`. That gets very messy, so I've commented that out and replaced the only code that I think was using it. Plus, is all this theme stuff necessary to reproduce the problem? It seems to just be added noise in the question. Try to provide *minimal* code to make a problem reproducible. – MrFlick Sep 12 '14 at 13:42
  • Actually, I went ahead and rolled back my changes. The data.frames still had problems with mismatched numbers of rows and row names. See [how to make an R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick Sep 12 '14 at 13:49
  • I have edited code to try and make it minimal and dataset 2 is now fully reproducible, data set 1 I am still having issues with but code still gives appearance of data. Thank you for that! – Holly Elliott Sep 12 '14 at 14:26

1 Answer

I'm not sure exactly why you're doing some of the things you are in your ggplot code, but try something more like this:

Chond_normalised <- structure(list(Element = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 11L, 12L, 13L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
11L, 12L, 13L), .Label = c("Th", "Nb", "La", "Ce", "Pr", "Nd", 
"Sm", "Zr", "Eu", "Ti", "Gd", "Tb", "Dy"), class = "factor"), 
    ppm = c(212, 65, 73, 49, 38, 26, 12, 25, 6, 6, 7, 6, 5, 8, 
    10, 122, 95, 73, 55, 26, 4, 17, 1, 14, 9, 7, 41, 46, 74, 
    57, 49, 43, 28, 19, 20, 6, 18, 13, 9), Sample = c("a", "a", 
    "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "b", 
    "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", 
    "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", "c", 
    "c")), .Names = c("Element", "ppm", "Sample"), row.names = c(NA, 
-39L), class = "data.frame")

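library(ggplot2)

# Use the element order of sample 'a' as the factor levels, so the x axis
# follows the order in the data rather than alphabetical order.
a <- Chond_normalised$Element[Chond_normalised$Sample == 'a']
Chond_normalised$Element <- factor(Chond_normalised$Element,levels = a)

# Subset to sample 'a' up front; group = 1 tells geom_path to join the
# points across the discrete x axis.
# (ggplot2 >= 3.4.0 prefers linewidth= over size= for lines.)
ggplot(data=Chond_normalised[Chond_normalised$Sample == 'a',]) +
    geom_path(aes(y=ppm, x=Element,group = 1), colour="black", size=1.0) +
    geom_point(aes(y=ppm, x=Element), colour="red", size=1.0) +
    scale_y_log10("Sample / Chondrite", breaks=c(0.1, 1, 10, 100, 1000)) +
    scale_x_discrete("")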

In general, avoid using `[` inside of `aes()`. `aes()` captures its arguments unevaluated and evaluates them later against the layer's data, so a subset made there can end up misaligned with the rows being plotted. If you need your data to be in a particular form, do the manipulation outside of ggplot first. That will be much cleaner and less prone to errors. In this case, just subset the data you want up front.
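For the follow-up goal mentioned in the comments (one line per sample), here is a minimal sketch of doing the reshaping outside of ggplot and then mapping `Sample` to `group` and `colour`. It assumes the wide layout from the question's second dataset; the names `wide`, `long`, and `samples` are only illustrative, and base R is used here instead of `reshape2::melt` purely to avoid an extra dependency.

```r
library(ggplot2)

# Wide layout from the question: one column per sample.
wide <- data.frame(
  Element = c("Th", "Nb", "La", "Ce", "Pr", "Nd", "Sm",
              "Zr", "Eu", "Ti", "Gd", "Tb", "Dy"),
  a = c(212, 65, 73, 49, 38, 26, 12, 25, 6, 6, 7, 6, 5),
  b = c(8, 10, 122, 95, 73, 55, 26, 4, 17, 1, 14, 9, 7),
  stringsAsFactors = FALSE
)

# Reshape to long form (one row per Element/Sample pair) before plotting.
samples <- c("a", "b")
long <- data.frame(
  # Explicit levels keep the elements in data order on the x axis.
  Element = factor(rep(wide$Element, times = length(samples)),
                   levels = wide$Element),
  ppm     = unlist(wide[samples], use.names = FALSE),
  Sample  = rep(samples, each = nrow(wide))
)

# One path and one set of points per sample; no subsetting inside aes().
p <- ggplot(long, aes(x = Element, y = ppm, group = Sample, colour = Sample)) +
  geom_path() +
  geom_point() +
  scale_y_log10("Sample / Chondrite")
```

Calling `print(p)` draws the plot. Adding more samples then only means adding their column names to `samples`, rather than adding another pair of `geom_` layers.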

Control discrete variable order with factors and level ordering.
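As a small illustration of that point (the vector names here are made up for the example), compare the default level order with an explicit one:

```r
elements <- c("Th", "Nb", "La", "Ce", "Pr")

# Default: levels are sorted, so a discrete axis would read Ce, La, Nb, Pr, Th.
default_order <- factor(elements)
levels(default_order)

# Explicit levels: the axis follows the order in which elements appear in the data.
data_order <- factor(elements, levels = unique(elements))
levels(data_order)
```

ggplot2 lays out a discrete axis in level order, which is why the code above resets the levels of `Element` before plotting.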

I'm not sure what you were trying to accomplish with ymin and ymax like that. I think maybe you actually wanted to use ylim() to set the plot limits? If you want to expand the plot, a good way to do that is to add a "dummy" data frame and use geom_blank.
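Here is a rough sketch of both options on a cut-down version of sample a. The data frame names (`sample_a`, `range_df`) are hypothetical. Note that because the plot already uses `scale_y_log10()`, the limits in the first option are passed to that scale directly; adding a separate `ylim()` would replace the log scale.

```r
library(ggplot2)

sample_a <- data.frame(
  Element = factor(c("Th", "Nb", "La"), levels = c("Th", "Nb", "La")),
  ppm     = c(212, 65, 73)
)

# Option 1: set the limits on the log scale itself.
p_limits <- ggplot(sample_a, aes(x = Element, y = ppm)) +
  geom_path(aes(group = 1)) +
  geom_point(colour = "red") +
  scale_y_log10("Sample / Chondrite",
                limits = c(0.1, 1000),
                breaks = c(0.1, 1, 10, 100, 1000))

# Option 2: a "dummy" data frame that spans the desired range. geom_blank()
# draws nothing, but the y scale is still trained on these values.
range_df <- data.frame(
  Element = factor("Th", levels = levels(sample_a$Element)),
  ppm     = c(0.1, 1000)
)

p_blank <- ggplot(sample_a, aes(x = Element, y = ppm)) +
  geom_blank(data = range_df) +
  geom_path(aes(group = 1)) +
  geom_point(colour = "red") +
  scale_y_log10("Sample / Chondrite", breaks = c(0.1, 1, 10, 100, 1000))
```

The second form expands the axis without declaring hard limits, so a value outside 0.1 to 1000 would still be drawn rather than dropped.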

The reason your data frame wasn't working well was that the row names attribute wasn't the right length (13 row names for 39 rows of data). My guess is that you were trying to write the structure() call by hand, rather than simply relying on dput.
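For reference, a minimal sketch of the `dput()` workflow on a toy data frame (the object names are illustrative only):

```r
df <- data.frame(
  Element = c("Th", "Nb", "La"),
  ppm     = c(212, 65, 73),
  Sample  = "a",
  stringsAsFactors = FALSE
)

# Prints a structure() call that can be pasted straight into a question.
dput(df)

# Round trip: parsing the deparsed text rebuilds the data frame, with
# row names that match the number of rows.
rebuilt <- eval(parse(text = deparse(df)))
nrow(rebuilt)
length(row.names(rebuilt))
```

Because R writes the `row.names` entry itself, its length always agrees with the columns, which is the part that is easy to get wrong when editing a `structure()` call by hand.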

joran
  • That worked perfectly, thank you so much for that!!!! I have spent many hours puzzling over what part of the code was causing the problem. I put it together from several different examples I found a year or so ago and it worked on other data frames but didn't like this one. Thanks again Joran! – Holly Elliott Sep 15 '14 at 08:56