

My list looks like this:

> dput(lapply(list1, head))
list(structure(c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), .Dim = c(6L, 
4L), .Dimnames = list(NULL, c("FPAR", "gpp", "LAI", "npp"))), 
    structure(c(63L, 83L, 66L, 84L, 92L, 85L, 9976L, 3318L, 9456L, 
    9435L, 9002L, 9395L, 21L, 32L, 18L, 34L, 50L, 36L, 6742L, 
    5228L, 5405L, 5136L, 5387L, 5339L), .Dim = c(6L, 4L), .Dimnames = list(
        NULL, c("FPAR", "gpp", "LAI", "npp"))), structure(c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA), .Dim = c(6L, 4L), .Dimnames = list(
        NULL, c("FPAR", "gpp", "LAI", "npp"))), structure(c(0L, 
    95L, 0L, 82L, 0L, 82L, 10306L, 10205L, 10306L, 10627L, 10306L, 
    10627L, 0L, 64L, 0L, 31L, 0L, 31L, 6396L, 6340L, 6396L, 6396L, 
    6396L, 6396L), .Dim = c(6L, 4L), .Dimnames = list(NULL, c("FPAR", 
    "gpp", "LAI", "npp"))), structure(c(NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    7331L, NA, NA), .Dim = c(6L, 4L), .Dimnames = list(NULL, 
        c("FPAR", "gpp", "LAI", "npp"))), structure(c(96L, 96L, 
    100L, 88L, 93L, 100L, 12734L, 12215L, 11383L, 11446L, 12672L, 
    11531L, 61L, 61L, 70L, 40L, 58L, 70L, 7807L, 7357L, 7695L, 
    6400L, 6009L, 7735L), .Dim = c(6L, 4L), .Dimnames = list(
        NULL, c("FPAR", "gpp", "LAI", "npp"))), structure(c(95L, 
    95L, 96L, 96L, 96L, 95L, 10829L, 10829L, 9652L, 10321L, 9652L, 
    10829L, 61L, 61L, 62L, 65L, 62L, 61L, 5144L, 5144L, 5188L, 
    5422L, 5188L, 5144L), .Dim = c(6L, 4L), .Dimnames = list(
        NULL, c("FPAR", "gpp", "LAI", "npp"))), structure(c(NA, 
    NA, 87L, 63L, 87L, 87L, NA, NA, 7891L, 7891L, 0L, 7891L, 
    NA, NA, 46L, 19L, 46L, 46L, NA, NA, NA, NA, NA, NA), .Dim = c(6L, 
    4L), .Dimnames = list(NULL, c("FPAR", "gpp", "LAI", "npp"
    ))), structure(c(100L, 100L, 100L, 78L, 100L, 100L, 4011L, 
    4011L, 4112L, 4306L, 3664L, 4112L, 70L, 70L, 70L, 21L, 70L, 
    70L, 2253L, 2425L, 2479L, NA, 2253L, 2253L), .Dim = c(6L, 
    4L), .Dimnames = list(NULL, c("FPAR", "gpp", "LAI", "npp"
    ))), structure(c(83L, 87L, 71L, NA, 0L, 82L, 5627L, 6626L, 
    5862L, NA, 6963L, 5541L, 29L, 37L, 16L, NA, 0L, 28L, 3877L, 
    3695L, 3635L, 1642L, 3692L, 3644L), .Dim = c(6L, 4L), .Dimnames = list(
        NULL, c("FPAR", "gpp", "LAI", "npp"))), structure(c(88L, 
    89L, 61L, NA, 0L, 88L, 6132L, 5636L, 6166L, NA, 6510L, 6502L, 
    49L, 51L, 18L, NA, 0L, 50L, 3413L, 3788L, 3519L, NA, 3463L, 
    3755L), .Dim = c(6L, 4L), .Dimnames = list(NULL, c("FPAR", 
    "gpp", "LAI", "npp"))))

So I'm trying to efficiently plot scatterplots between variable 1 (FPAR) and variables 2-4 (gpp, LAI, npp) for 11 dataframes in a list. I've attached a snippet of what my list looks like.

This is what I've come up with so far, but it isn't working.

library(ggplot2)
scatterplot <- function(yvar){
  ggplot(list1, aes_(x=~age, y=as.name(yvar))) + 
    geom_point() +
    facet_wrap(~dataframe)
}
lapply(list1, scatterplot)

This is my first time working with lists and ggplot, so I got a little stuck.

Gurt
  • Can you show a [minimal example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of what your list looks like? Do they share the same column names? If so, use `dplyr::bind_rows` and its `.id` argument to create a single data frame. – markus Nov 01 '18 at 10:47
  • I've added a snip, which is the quickest way, I think. They have different column names as they represent different indices. – Gurt Nov 01 '18 at 11:08
  • Better share `dput(lapply(list1, head))`, not pictures. – markus Nov 01 '18 at 11:13
  • Okay, thanks! I added it, it's quite long though :) – Gurt Nov 01 '18 at 11:16
  • Way better! Your data frames are actually matrices, and they share the same column names. Where is the fifth variable you mention in your question title? And more importantly: what do you want to put on the x-axis and what on the y-axis? – markus Nov 01 '18 at 11:18
  • Thanks! I am trying to plot FPAR on the x-axis and then make three separate plots with gpp, LAI and npp on the y-axis, for all 11 data.frames. If that makes sense – Gurt Nov 01 '18 at 11:20

2 Answers


Here is a way you could do this.

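library(tidyverse)

list1 %>% 
  map(., as_tibble) %>%                   # each matrix becomes a tibble (as_data_frame() is deprecated)
  bind_rows(., .id = "dataframe") %>%     # stack them; `dataframe` records the list position
  gather(., key, value, gpp:npp) %>%      # long format: gpp, LAI and npp go into key/value
  # keep the panels in list order rather than alphabetical order
  mutate(dataframe = factor(dataframe, levels = unique(dataframe))) %>% 
  ggplot(., aes(FPAR, value, col = key)) +
  geom_point() +
  facet_wrap(. ~ dataframe, ncol = 3)     # one panel per list element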

warning: Removed 71 rows containing missing values (geom_point).



You might want to read:

markus
  • Edit: I made a typo, sorry! This works, although I have no idea how. So will be researching that, many thanks for the help! – Gurt Nov 01 '18 at 11:47
  • @Gurt. Glad it worked. What parts don't you understand? I might edit and give some explanation. – markus Nov 01 '18 at 11:49
  • The `.` represents the data you use in a function. So yes, it does represent your list in `map` and in `bind_rows`. After you called `bind_rows` we deal with a data frame and not a list anymore. You can omit `.` if you like. Here can you read more about it: https://magrittr.tidyverse.org/#the-argument-placeholder ... – markus Nov 01 '18 at 11:59
  • ... `mutate` is a function from the `dplyr` package that is here used to convert the column `dataframe` to a factor. This is useful if you want to specify the order of a variable. If you would omit that line you'd see that ggplot would order your panels like so: `1 10 11 2 3 4 5 6 7 8 9` from top left to bottom right. – markus Nov 01 '18 at 12:02
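
The two points from the comments above (the `.` placeholder and the panel ordering) can be tried out in isolation. This is only a small sketch: `ids` is a made-up stand-in for the `dataframe` column that `bind_rows(.id = "dataframe")` creates, and it assumes the magrittr package (loaded with the tidyverse) is available.

```r
library(magrittr)

# Hypothetical stand-in for the id column: list positions stored as text
ids <- as.character(1:11)

# `.` stands for the left-hand side of the pipe, so these two calls are equivalent
identical(ids %>% rev(.), ids %>% rev())

# Without explicit levels, factor() sorts the values as text
levels(factor(ids))

# With levels = unique(ids), the order of first appearance is kept
levels(factor(ids, levels = unique(ids)))
```

Because `facet_wrap` lays out panels in the order of the factor levels, the second form is what keeps panel 2 ahead of panel 10.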

This does not answer your question but might get you a bit further along. You need to think about the inputs of your function. If list1 is a list of data.frames, then the input of your scatterplot function will be a single data.frame. Depending on your data.frame structure, your function could look like:

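library(ggplot2)

# Example data: two numeric columns and a grouping column `d`
df <- data.frame(a = rnorm(n = 20),
                 b = rnorm(n = 20),
                 d = c(rep('first', 10),
                       rep('sec', 10)))
df_l <- list(df, df)

# Takes one data.frame and returns one plot, with a panel per level of `d`
scatterplot <- function(df){
  ggplot(df, aes(x = a, y = b)) + 
    geom_point() +
    facet_wrap(~d)
} 

# Returns a list with one plot per data.frame
lapply(df_l, scatterplot)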
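
Applied to the data in the question, the same idea might look roughly like the sketch below. It assumes every list element is a matrix with the columns FPAR, gpp, LAI and npp, as in the `dput` output. The names `toy_list`, `make_mat` and `scatter_one` are made up for illustration, and the `.data` pronoun needs ggplot2 3.0.0 or later.

```r
library(ggplot2)

# Toy stand-in for list1: two matrices with the question's column names
set.seed(1)
make_mat <- function() {
  matrix(sample(0:100, 24, replace = TRUE), ncol = 4,
         dimnames = list(NULL, c("FPAR", "gpp", "LAI", "npp")))
}
toy_list <- list(make_mat(), make_mat())

# One scatterplot of FPAR against a chosen y column, for a single matrix
scatter_one <- function(m, yvar) {
  df <- as.data.frame(m)  # ggplot() needs a data frame, not a matrix
  ggplot(df, aes(x = FPAR, y = .data[[yvar]])) +
    geom_point() +
    labs(y = yvar)
}

# For every matrix, build one plot per y variable
yvars <- c("gpp", "LAI", "npp")
plots <- lapply(toy_list, function(m) {
  lapply(yvars, function(v) scatter_one(m, v))
})
```

Here `plots[[1]][[2]]` would be the FPAR vs. LAI plot for the first matrix. This gives separate plots rather than facets, which may or may not be what you want for 11 list elements.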

good luck :)

mrjoh3
  • Thanks! But I still don't quite understand what the facet_wrap does? I have seen an example with days and months, but I would like my plots to be grouped by data.frame? – Gurt Nov 01 '18 at 11:13
  • facet_wrap creates subplots for a single `data.frame` based on a column. The ggplot2 docs explain this well and include examples: https://ggplot2.tidyverse.org/reference/facet_wrap.html. Also note that using lapply results in two plots, each with 2 subplots. – mrjoh3 Nov 01 '18 at 11:18
  • Actually, trying out your answer, this wouldn't work for my dataset. I have melted one of my data.frames, but then FPAR ends up in the same column as my other variables – Gurt Nov 01 '18 at 11:29
  • Check the docs for melt. From memory I think you need `melt(df, id.vars = "FPAR")` – mrjoh3 Nov 01 '18 at 11:35
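
To illustrate the melt suggestion in the last comment, here is a minimal sketch, assuming the reshape2 package is installed. The values are the first three rows of the second matrix in the question. The list elements there are matrices, so each one would probably need `as.data.frame()` first; on a matrix, `melt` uses a different method that does not take id columns.

```r
library(reshape2)

# First three rows of the second matrix in the question, as a data frame
df <- data.frame(FPAR = c(63, 83, 66),
                 gpp  = c(9976, 3318, 9456),
                 LAI  = c(21, 32, 18),
                 npp  = c(6742, 5228, 5405))

# Keep FPAR as an identifier column; stack the remaining columns
# into a `variable` / `value` pair
long <- melt(df, id.vars = "FPAR")
head(long)
```

With FPAR kept as an id column, it stays available for the x-axis while `value` goes on the y-axis and `variable` can be used for colour or facets.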