I have a df such as:

name <- rep(c("a","b","c"),5)
QV.low <- runif(15, 2, 5)
QV.med <- runif(15, 5.0, 7.5)
QV.high <- runif(15, 7.5, 10)
# data.frame() keeps the QV columns numeric; cbind() would coerce them to character
df <- data.frame(name, QV.low, QV.med, QV.high)

and a list of names:

name.list <- c("a","b")

I want to do an operation, e.g.:

df %>% 
    subset(name %in% name.list) %>%
    summarise(.,sum = sum(QV.low))

but I want to do it for each QV.* variable in a loop.

I tried:

QV.list <- c("QV.low", "QV.med", "QV.high")
for(qv in 1:length(QV.list)){
    QV <- noquote(QV.list[qv])
    print(QV)
    df %>% 
        subset(name %in% name.list) %>%
        summarise(.,sum = sum(QV))
}

But it does not work.

How can I "extract" the character value from QV.list in order to use it as a df variable later?

TeYaP
  • How is the name related to the df? What are you trying to subset? – Kozolovska Mar 14 '19 at 12:09
  • I have to subset my df. Here, as an example, I keep only the rows of the df whose names are present in a list of names. Then I try to obtain the sum of QV.low first, then QV.med, then QV.high. – TeYaP Mar 14 '19 at 12:17

2 Answers

You need at least 3 different names in `namecol`; otherwise the filter `namecol %in% name.list1` keeps every row and is useless. If there's no filter and no pipe, there's no need for a loop: a simple `colSums(df[,-1])` will do the job.
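
As a minimal sketch of the loop-free route, assuming a small hand-made data frame (the values below are illustrative, not the random data used in this answer), `colSums()` can be applied to the numeric columns directly, or after subsetting the rows in base R:

```r
# Illustrative data; the values are made up for this sketch
df <- data.frame(
  namecol = c("a", "b", "c", "a", "b", "c"),
  QV.low  = c(2, 3, 4, 2.5, 3.5, 4.5),
  QV.med  = c(5, 6, 7, 5.5, 6.5, 7.5),
  QV.high = c(8, 9, 10, 8.5, 9.5, 7.5)
)
name.list1 <- c("a", "b")

# No filter: sum every numeric column, dropping the first (character) column
all_sums <- colSums(df[, -1])

# With a filter: keep the matching rows first, then sum the same columns
filtered_sums <- colSums(df[df$namecol %in% name.list1, -1])

print(all_sums)
print(filtered_sums)
```

Both results are named numeric vectors, so the sums stay labelled with their column names.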

library(tidyverse)

QV.low <- runif(10, 2, 5)
QV.med <- runif(10, 5.0, 7.5)
QV.high <- runif(10, 7.5, 10)
namecol <- sample(c("a","b", "c"), 10, replace = TRUE)
df <-  data.frame(namecol, QV.low, QV.med,QV.high)
df
name.list1  <- c("a","b") # select some names

QV.list <- c("QV.low", "QV.med", "QV.high")

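for(i in QV.list){
  QV <- noquote(i)   # column name without quotes, used for printing only
  print(QV)
  qv <- sym(i)       # convert the string into a symbol
  print(df %>% 
    filter(namecol %in% name.list1) %>%
    summarise(sum = sum(!!qv)))   # !! unquotes the symbol so it refers to the column
}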

will give you

[1] QV.low
     sum
1 29.093
[1] QV.med
       sum
1 61.07034
[1] QV.high
       sum
1 86.02611
Joe McMahon
LAP
  • I edited my question; part of the df was forgotten. By the way, `sym()` seems to be what I was looking for. Do you have a short explanation for the `!!` in front of qv, please? – TeYaP Mar 14 '19 at 12:45
  • The `!!` is part of `dplyr`'s non-standard evaluation. See here for further information: https://stackoverflow.com/questions/26724124/standard-evaluation-in-dplyr-summarise-on-variable-given-as-a-character-string – LAP Mar 14 '19 at 12:50
  • What if I want to replace the name `sum` in the output with `QV.low`, `QV.med` and `QV.high`? – TeYaP Mar 14 '19 at 16:14
  • Use `summarise(!!QV := sum(!!qv))` instead (see [here](https://stackoverflow.com/questions/26003574/dplyr-mutate-use-dynamic-variable-names) why). – LAP Mar 15 '19 at 07:09
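
The following is a minimal sketch of what the comments above describe, not the answerer's exact code. It assumes a small illustrative data frame, stores each result in a hypothetical `results` list, and puts the plain string `col` (rather than the `noquote` object `QV`) on the left-hand side of `:=`. `rlang::sym()` turns the string into a symbol, `!!` injects it into `sum()`, and `!!col :=` names the output column after the string.

```r
library(dplyr)

# Illustrative data; the values are made up for this sketch
df <- data.frame(
  namecol = c("a", "b", "c", "a", "b", "c"),
  QV.low  = c(2, 3, 4, 2.5, 3.5, 4.5),
  QV.med  = c(5, 6, 7, 5.5, 6.5, 7.5),
  QV.high = c(8, 9, 10, 8.5, 9.5, 7.5)
)
name.list1 <- c("a", "b")
QV.list <- c("QV.low", "QV.med", "QV.high")

results <- list()  # hypothetical container, one summary per column
for (col in QV.list) {
  qv <- rlang::sym(col)              # string -> symbol
  results[[col]] <- df %>%
    filter(namecol %in% name.list1) %>%
    summarise(!!col := sum(!!qv))    # output column is named after `col`
  print(results[[col]])
}
```

Storing the one-row summaries in a list keeps them available after the loop, which printing alone does not.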

If I understood your problem correctly, you can solve it like this:

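for (col in names(df)[-1]) {   # skip the first (name) column; the QV columns must be numeric
  df[, col]                    # the column selected by its character name
  # ....
  print(df %>% summarise(sum = sum(df[, col])))
}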
LorNap