
I'm using lapply() with a function but, instead of getting only one variable name in each iteration, I'm getting every variable name from my data set.

colnames(data) <- c("var1", "var2")

varnames <- function(var, name){
   return(name)
}

print(lapply(data, varnames, name=names(data)))

I get this output:

$var1
[1] "var1" "var2"

$var2
[1] "var1" "var2"

But, I'd like to get:

$var1
[1] "var1"

$var2
[1] "var2"
zx8754
Gabriel Quesada
  • You have a 2-column dataset. When you call `lapply`, it loops through the columns, i.e. a list of vectors. You are getting every variable name because you specified it that way with `name = names(data)`. – akrun Oct 24 '16 at 14:00
  • @akrun I see... but how can I get just one variable name instead of every variable name? The name of the first variable the first time, the second one after that... – Gabriel Quesada Oct 24 '16 at 14:08
  • Would `as.list(colnames(data))` do what you want? – Phil Oct 24 '16 at 14:12
  • You can use `lapply` on `names(data)` and use a function such as `function(x) { print(x); print(data[[x]]) }` to manipulate the data. – m-dz Oct 24 '16 at 14:15

1 Answer


What you specify here in code is:

lapply(data, varnames, name=names(data))

This loops over the columns of data and passes an additional argument, name, to the function varnames. Note that lapply does not iterate over name; it passes the whole vector as is on every call.

If you want to get the output you require, you can use mapply. This is essentially a version of lapply which allows multiple variables to be iterated over:

mapply(varnames, data, names(data)) 
  var1   var2 
"var1" "var2" 
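
Note that mapply simplifies its result by default, which is why the output above is a named character vector rather than the list shown in the question. A minimal sketch of two ways to keep the list structure, assuming a small made-up data frame in place of the asker's data:

```r
# Example data frame with the two columns from the question (values are made up)
data <- data.frame(var1 = c(1, 2, 3), var2 = c(4, 5, 6))

varnames <- function(var, name) {
  name
}

# mapply() simplifies to a named character vector by default;
# SIMPLIFY = FALSE keeps one list element per column
res_mapply <- mapply(varnames, data, names(data), SIMPLIFY = FALSE)

# Map() is a thin wrapper around mapply() that never simplifies
res_map <- Map(varnames, data, names(data))

print(res_map)
```

Both calls take their element names from the first iterated argument, here the columns of data.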

To apply some operations to each column and still keep the name in some way, I would use dplyr (you are not entirely clear about your end goal, so I'm guessing this is what you were after):

library(dplyr)
data = data.frame(var1 = runif(10), var2 = runif(10), 
                  var3 = runif(10), var4 = runif(10))
data %>% summarise_each(funs(mean, sd, median), var1:var4)
  var1_mean var2_mean var3_mean var4_mean   var1_sd   var2_sd  var3_sd
1 0.6063735 0.5308427 0.2872901 0.6043027 0.2586042 0.2303065 0.245709
    var4_sd var1_median var2_median var3_median var4_median
1 0.2721362   0.6814136   0.4535982    0.200493    0.644607

Alternatively, using gather from tidyr:

library(tidyr)
data %>% 
   gather(variable, value) %>% 
   group_by(variable) %>% 
   summarise_each(funs(mean, sd, median), value)
Source: local data frame [4 x 4]

  variable      mean        sd    median
    (fctr)     (dbl)     (dbl)     (dbl)
1     var1 0.6063735 0.2586042 0.6814136
2     var2 0.5308427 0.2303065 0.4535982
3     var3 0.2872901 0.2457090 0.2004930
4     var4 0.6043027 0.2721362 0.6446070
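
In more recent releases, summarise_each() and funs() are deprecated in dplyr, and gather() is superseded in tidyr. The following is a rough sketch of the same two summaries using across() and pivot_longer(), assuming dplyr 1.0.0 or later and tidyr 1.0.0 or later; the seed is arbitrary and only there to make the example repeatable:

```r
library(dplyr)
library(tidyr)

set.seed(1)  # arbitrary seed, only so the example is repeatable
data <- data.frame(var1 = runif(10), var2 = runif(10),
                   var3 = runif(10), var4 = runif(10))

# Wide result: one column per variable/statistic pair (e.g. var1_mean)
wide <- data %>%
  summarise(across(var1:var4, list(mean = mean, sd = sd, median = median)))

# Long result: one row per variable, one column per statistic
long <- data %>%
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  group_by(variable) %>%
  summarise(mean = mean(value), sd = sd(value), median = median(value))

print(wide)
print(long)
```

The column order of the wide result may differ from the summarise_each() output, since across() groups the statistics by column.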
Paul Hiemstra
  • I have a data frame with 40 variables and I want to use a function which draws a barplot for each variable. I'd like to use the name of each variable as the title of each barplot. That's my goal. – Gabriel Quesada Oct 24 '16 at 14:48
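
For the goal described in the comment above, one possible approach is to iterate over names(data) rather than over the columns, so that each call knows which variable it is handling. This is only a sketch: the data frame is a made-up stand-in, plot_var is a hypothetical helper, and it assumes the variables are categorical so that a barplot of counts makes sense.

```r
# Hypothetical stand-in for the 40-variable data frame: three categorical columns
data <- data.frame(
  var1 = c("a", "b", "a", "c"),
  var2 = c("x", "x", "y", "y"),
  var3 = c("m", "n", "n", "n")
)

# Hypothetical helper: draw one barplot of value counts, titled with the column name
plot_var <- function(name) {
  counts <- table(data[[name]])
  barplot(counts, main = name)
  invisible(name)
}

# Iterate over the names, not the columns, so each call knows its own name
titles <- lapply(names(data), plot_var)
```

Map(function(col, name) barplot(table(col), main = name), data, names(data)) would be an equivalent way to pass the column and its name together.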