
I have a large dataset (14295, 58). Each column is a different element from the periodic table (e.g. Fe, Ca, Zr) and the rows are arranged according to depth (in mm); the last column is the depth value. I am trying to write code that can be customized to a given group of elements over a given depth interval, but I don't want to have to go through and change a bunch of lines of code every time I look at a different subset. So far I have created a dataframe called Section:

Section <- df[50:100,]

and a vector called Elements:

Elements <- c("Fe", "Ca", "Zr")

I can subsample the Section data frame by:

Section %>%
    select(all_of(Elements), depth)

but now I want to plot this with ggplot and I can't figure out how to call the Elements vector to the x-variable. I tried:

Section %>%
    select(all_of(Elements), depth) %>%
    ggplot() +
    geom_path(aes(Elements, depth))

but the arguments don't have the same length. How can I plot the selected elements from the Elements vector?

JJGabe
  • Please share your data (or a subset) so people can provide solutions – Mike Nov 14 '18 at 17:00
  • Welcome to SO. We can't really help you when we are not able to reproduce your code. Have a look at how to ask a good question. You won't get a good answer without a good question. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example https://stackoverflow.com/help/mcve – tjebo Nov 14 '18 at 17:22
  • `d <- seq(0, 100, 0.5); Fe <- runif(201, min = 0, max = 1000); Ca <- runif(201, min = 0, max = 1000); Zr <- runif(201, min = 0, max = 1000); Ti <- runif(201, min = 0, max = 1000); Al <- runif(201, min = 0, max = 1000); example <- data.frame(d, Fe, Ca, Zr, Ti, Al)` – JJGabe Nov 14 '18 at 17:30
  • Sorry it's not in a great format. I'm not sure of the best way to provide a dataset after the question has been asked – JJGabe Nov 14 '18 at 17:32

1 Answer


I think your problem is actually that your data is not in the most useful format (wide vs. long), so you aren't giving ggplot what you think you are. If you give it a vector as an aesthetic (Elements here), it will try its best to plot it. That only works when the vector's length matches the number of rows, in which case it simply pairs each value in depth with the entry of Elements at the same position. So this works:

library(ggplot2)

# Toy Data
df <- data.frame(O = 1:3,
                 Fe = 2:4,
                 Ca = 3:5,
                 Zr = 4:6,
                 depth = 5:7)
Elements <- c('Fe', 'Ca', 'Zr')
# Elements has 3 entries and df has 3 rows, so the lengths happen to match
ggplot(df) +
    geom_point(aes(x=Elements, y=depth))

But it just matches the first depth to 'Fe', the second depth to 'Ca', etc. I don't think that's what you are hoping to have happen.

Long vs Wide Data

You have a separate column for each of these elements, but do they actually represent different things? You are probably better off re-formatting your data so that all these "element" columns get collapsed into key-value pairs using tidyr:

# Wide:
df
  O Fe Ca Zr depth
1 1  2  3  4     5
2 2  3  4  5     6
3 3  4  5  6     7

# Long
library(tidyr)
longDf <- tidyr::gather(df, element, amount, -depth)
longDf
   depth element amount
1      5       O      1
2      6       O      2
3      7       O      3
4      5      Fe      2
5      6      Fe      3
6      7      Fe      4
7      5      Ca      3
8      6      Ca      4
9      7      Ca      5
10     5      Zr      4
11     6      Zr      5
12     7      Zr      6
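
As a side note, tidyr 1.0.0 and later offer pivot_longer() as the successor to gather(). The following is only a minimal sketch, assuming the toy df from above and that only the columns named in Elements are wanted; longSel is a name made up for this illustration:

```r
library(tidyr)

# Toy data, same shape as the example above
df <- data.frame(O = 1:3, Fe = 2:4, Ca = 3:5, Zr = 4:6, depth = 5:7)
Elements <- c("Fe", "Ca", "Zr")

# Keep only the wanted element columns plus depth, then reshape to long.
# all_of() tells tidyr that Elements is an external character vector.
# longSel is a hypothetical name used only in this sketch.
longSel <- pivot_longer(df[c(Elements, "depth")],
                        cols = all_of(Elements),
                        names_to = "element",
                        values_to = "amount")
```

Because the selection happens before reshaping, the O column never enters the long table, so no later filter on element is needed.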

Now you can get the elements you want using dplyr's filter (which is also probably a better option for subsetting by depth) and use the new element column as the x coordinate for plotting:

library(dplyr)  # filter() and the pipe

longDf %>%
    filter(element %in% Elements) %>%
    ggplot() +
        geom_path(aes(x=element, y=depth))

I'm not sure what you're expecting the graph to look like, but that should get you started.
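
Since the goal was to avoid editing several lines for every new subset, the steps above could be wrapped in one function. This is a rough sketch rather than a definitive implementation: plot_elements() and its arguments are hypothetical names, the depth column is assumed to be called depth, and plotting amount against depth with one path per element is only one plausible reading of the desired graph.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical helper (not part of any package): one path per element,
# restricted to a depth interval. Assumes a column named "depth".
plot_elements <- function(data, elements, depth_min, depth_max) {
  data %>%
    # keep only rows inside the requested depth interval
    filter(depth >= depth_min, depth <= depth_max) %>%
    # keep only the requested element columns plus depth
    select(all_of(c(elements, "depth"))) %>%
    # wide -> long: one row per (depth, element) pair
    pivot_longer(cols = all_of(elements),
                 names_to = "element",
                 values_to = "amount") %>%
    # geom_path() connects points in row order, so sort by depth
    arrange(element, depth) %>%
    ggplot(aes(x = amount, y = depth, group = element)) +
    geom_path()
}

# Made-up data in the spirit of the example posted in the comments
set.seed(1)
example <- data.frame(depth = seq(0, 100, 0.5),
                      Fe = runif(201, min = 0, max = 1000),
                      Ca = runif(201, min = 0, max = 1000),
                      Zr = runif(201, min = 0, max = 1000))

# Changing the subset now only means changing these arguments
p <- plot_elements(example, c("Fe", "Ca"), depth_min = 10, depth_max = 20)
```

Calling print(p) draws the plot.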

Taiki Sakai
  • Thanks @Taiki Sakai! That does help to get me started. My issue now is that I want each graphed line to be a different colour. When I added to geom_path `geom_path(aes(x=amount, y=depth), colour = element)` I got an error saying "object 'element' not found". However, if I run `test <- longDf %>% filter(element %in% Elements)` the resulting dataframe has variable as a column. Any thoughts? – JJGabe Nov 14 '18 at 18:10
  • Strike that last question, I had the colour operation in the wrong bracket – JJGabe Nov 14 '18 at 18:18
  • You need to put `colour = element` inside the `aes` part of the code. This is where `ggplot` looks for the names of columns in the data.frame you supplied as `data`; outside of the `aes` part it will look for objects in your global environment. – Taiki Sakai Nov 14 '18 at 18:21
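
The difference described in the last comment can be sketched as follows. This is only an illustration: the small longDf is rebuilt by hand here, and p_mapped and p_fixed are names made up for the example.

```r
library(ggplot2)

# Hand-built long data with the same columns as longDf in the answer
longDf <- data.frame(depth   = rep(5:7, times = 3),
                     element = rep(c("Fe", "Ca", "Zr"), each = 3),
                     amount  = c(2:4, 3:5, 4:6))

# Inside aes(): colour is mapped to the element column,
# so each element gets its own colour and a legend entry
p_mapped <- ggplot(longDf) +
    geom_path(aes(x = amount, y = depth, colour = element))

# Outside aes(): colour is a fixed value, not a column name,
# so every path is drawn in the same colour
p_fixed <- ggplot(longDf) +
    geom_path(aes(x = amount, y = depth, group = element),
              colour = "steelblue")
```

Writing `colour = element` outside `aes()` fails because no object called element exists outside the data frame.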