
I have this function that works close to what I need -- it creates a clean table from my original raw data, makes it a ggplot, and uses lapply to run it through all the variables I want from the original table, data:

#Get colnames of all numeric variables
nlist <- names(data[,sapply(data,is.numeric)])

#Create function
varviz_n <- function(dat, var){
  var <- dat[,which(names(dat) == var)]

  title<-var

  tab <- dat %>%
    group_by(group = cut(var, breaks = seq(0, max(var), 10)),
             groupedsupport) %>%
    summarise(n = n()) %>%
    mutate(freq = n / sum(n)) %>%
    filter(!is.na(group),n>10)

  tab2 <- tab %>%
    group_by(groupedsupport) %>%
    summarise(mean = mean(freq),
              median = median(freq))

  finaltab <- tab %>% left_join(tab2, by = "groupedsupport")

  fplot <- finaltab %>%
    ggplot(aes(fill=group,x=groupedsupport,y=freq)) +
    geom_col(position="dodge") +
    geom_text(aes(label = paste("n =",n), n = (n + 0.05)), position = position_dodge(0.9), vjust = 0, size=2) +
    geom_errorbar(aes(groupedsupport, ymax = median, ymin = mean),
                  size=0.5, linetype = "longdash", inherit.aes = F, width = 1) +
    scale_y_continuous(labels = scales::percent) +
    xlab("") + ylab("") +
    ggtitle(title) + 
    scale_fill_discrete("")

filename = filename <- paste0(finaltab$var)
ggsave(paste("Plots/",filename,".png"), width = 10, height = 7)

  return(fplot)
}

#Run function
lapply(nlist, varviz_n, dat = data)

This does almost exactly what I want. The problem is that all of the variables it runs through are numeric on a 0-100 scale, and although the plots get created, I can't figure out how to get the column name as the title of the plot or of the key. So I have no idea which graph is being returned.

Can someone please help me figure out a way to get the column name from nlist to be the title of my plot? The way it is now prints out the first value of the column instead of the actual column name:

[Image: exactly the graph I want, with no title]

The final piece of code to save it in the 'Plots' folder doesn't work either since the title/var isn't populating correctly.

You can use something like this to create data to test out the code:

data <- data.frame(v1 = sample(1:100, 1000, replace = TRUE),
                   v2 = sample(1:100, 1000, replace = TRUE),
                   v3 = sample(1:100, 1000, replace = TRUE),
                   groupedsupport = sample(LETTERS[1:3], 1000, replace = TRUE))

Thanks!

Ryan
  • Whenever you are using `lapply` (which operates on *one list*) and think you want to operate simultaneously on the corresponding elements in a second list/vector, think `Map`. See a previous related answer about `Map` here: https://stackoverflow.com/a/54485425/3358272 – r2evans Jun 05 '19 at 22:19
  • Take a look at `imap` or `iwalk` https://stackoverflow.com/a/52902877/786542 & https://stackoverflow.com/a/55114232/786542 – Tung Jun 05 '19 at 22:48
  • @r2evans thanks! Are you saying I should use `Map` _instead_ of my `lapply` or is there a way to add it somewhere to my existing script just for the plot titles? – Ryan Jun 05 '19 at 23:14
  • Try flipping the first two lines of the function so that `title` gets assigned before you change the value of `var` – GordonShumway Jun 05 '19 at 23:18
  • @Ryan, *replace*. `lapply(mydat, myfunc)` is analogous to `Map(myfunc, mydat)`, but if `myfunc2` takes two arguments, then `Map(myfunc2, mydat1, mydat2)` has no *direct* equivalent in `lapply` (though it can be crudely approximated with `lapply(seq_along(mydat1), function(i) myfunc2(mydat1[[i]], mydat2[[i]]))`). – r2evans Jun 05 '19 at 23:22
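
The `Map` suggestion in the comments above can be sketched roughly as follows. This is only an illustration in base R: `describe_col` and the `toy` data frame are made-up names, not part of the original script, and the helper returns a label rather than drawing a plot.

```r
# Hypothetical helper (not from the original post): takes a column's
# values and that column's name as two separate arguments.
describe_col <- function(values, col_name) {
  paste0(col_name, ": mean = ", round(mean(values), 1))
}

# Toy data, made up for illustration.
toy <- data.frame(v1 = c(10, 20, 30), v2 = c(40, 50, 60))

# Map walks the columns and their names in parallel.
res_map <- Map(describe_col, toy, names(toy))

# The lapply version has to index both inputs by position.
res_lapply <- lapply(seq_along(toy), function(i) {
  describe_col(toy[[i]], names(toy)[i])
})
```

Both calls should produce the same labels. The `Map` result additionally carries the column names as list names, which can help when matching a result back to its column.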

1 Answer


I think you just need to swap these steps:

var <- dat[,which(names(dat) == var)]
title <- var

should be

title <- var
var <- dat[,which(names(dat) == var)]

In the original order, var is reassigned to the selected column of data, so by the time it is used for title it holds that vector of values and not the column name.
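
To see the difference in isolation, here is a minimal sketch with the plotting stripped out. The functions `title_before` and `title_after` and the `toy` data frame are hypothetical names used only for this illustration, and it assumes a plain data.frame (a tibble would behave differently under `[, i]`).

```r
# Toy data, made up for illustration.
toy <- data.frame(v1 = c(5, 15, 25), groupedsupport = c("A", "B", "C"))

# Original order: var is overwritten before title is taken from it.
title_before <- function(dat, var) {
  var <- dat[, which(names(dat) == var)]
  title <- var
  title
}

# Swapped order: title captures the column name first.
title_after <- function(dat, var) {
  title <- var
  var <- dat[, which(names(dat) == var)]
  title
}

title_before(toy, "v1")  # returns the column's values
title_after(toy, "v1")   # returns the column name
```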

If this doesn't resolve it, please give us some code to mimic the contents of data.

yake84
  • Actually OP is overwriting the *var* parameter, an ill-advised approach. Keep *var* as character param and use a different variable for data frame object or simply use *title* as function param: `varviz_n <- function(dat, title){ var <- dat[,which(names(dat) == title)] ...` – Parfait Jun 07 '19 at 01:06
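
A rough sketch of the approach in the comment above, keeping the column name in its own character parameter and reading the values into a separate variable. `plot_label`, `out_dir`, and `toy` are hypothetical names for illustration only. The helper just returns the title and a file path that could be passed on to `ggtitle()` and `ggsave()`. It also builds the path with `file.path()` and `paste0()`, on the assumption that the spaces which `paste()` inserts by default are not wanted in the file name.

```r
# Hypothetical helper: title stays a character string throughout,
# and the column values get their own variable.
plot_label <- function(dat, title, out_dir = "Plots") {
  values <- dat[[title]]
  list(
    title = title,
    n = length(values),
    # file.path joins with "/", paste0 joins with no separator
    file = file.path(out_dir, paste0(title, ".png"))
  )
}

# Toy data, made up for illustration.
toy <- data.frame(v1 = 1:4, v2 = 5:8)

# Same calling pattern as the original lapply(nlist, varviz_n, dat = data).
labels <- lapply(c("v1", "v2"), plot_label, dat = toy)

# For comparison: paste() separates its arguments with a space by default.
spaced <- paste("Plots/", "v1", ".png")
```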