
I want to add columns to a survey.design object created with the survey package, which can be done as follows:

library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
dclus2 <- transform(dclus1, 
                    api00_b = api00 + 1)

svymean(~ api00, design = dclus2)
#>         mean     SE
#> api00 644.17 23.542
svymean(~ api00_b, design = dclus2)
#>           mean     SE
#> api00_b 645.17 23.542

For a more complex task, I need to create these variable names dynamically from external vectors. The following produces an error, but I think it illustrates what I want to achieve:

vars <- c("api00_a", "api00_b")
dclus2 <- transform(dclus1, 
                    vars[[2]] = api00 + 1)

How could dynamic names for the new columns be implemented?

Crimc
  • possible to edit the data frame `apiclus1` before the `svydesign` creation? `apiclus1[ , vars ] <- apiclus1[ , "api00" ] + 1` .. or https://stackoverflow.com/a/16225175/1759499 ? – Anthony Damico Dec 23 '21 at 07:22
  • It is not possible to edit `apiclus1` before (because I need to create new columns using the survey design `dclus1`). I did try to make this with combinations of `eval()` , `quote()` `get()` and `assign()` without success (but perhaps there is a way with those) – Crimc Dec 23 '21 at 15:03
  • how about `lapply( c( "meals" , "ell" ) , function( w ) svymean( ~ newvar , update( dclus1 , newvar = get( w ) + 1 ) ) )` ? – Anthony Damico Jan 01 '22 at 21:56

3 Answers

1

Here's a possible solution using purrr:

library(purrr)

vars <- c("api00_a", "api00_b")

transform_func <- function(data, vars) {
  transform(data, vars = api00 + 1)
}

map(vars, ~transform_func(dclus1, .))

Which gives us the following list:

[[1]]
1 - level Cluster Sampling design
With (15) clusters.
update(`_data`, ...)

[[2]]
1 - level Cluster Sampling design
With (15) clusters.
update(`_data`, ...)
Matt
  • In this solution a list with two survey.design objects is created with the same new `vars` column in each. I need one single survey.design where the new column is named `api00_b` (or any other name in an external character vector). – Crimc Dec 22 '21 at 03:06
  • Can you provide your expected output? I'm not entirely sure what you mean – Matt Dec 22 '21 at 03:25
  • Like in the OP example, `dclus2` should have a new column called `api00_b`, but this column name should be created from an external vector (`vars` in the OP) – Crimc Dec 22 '21 at 04:29
1

I don't think you can use a vector like this on the left-hand side of the equal sign in R. You don't have to use transform, which calls survey:::update.survey.design, though. You could just add your new variable directly:

dclus2 <- dclus1
dclus2$variables[ ,vars[[1]]] <- dclus2$variables[,"api00"] + 1

This is the same as creating the new variable before converting to a survey.design object, as long as you do not use any survey functions for creation of the new variable. Just using Anthony's comment:

apiclus2 <- apiclus1
apiclus2[ , vars[[1]]] <- apiclus2[ , "api00" ] + 1
dclus_prep_2 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus2, fpc = ~fpc)

You might prefer to use srvyr, which allows your kind of programming with dplyr's !! and := operators:

library(srvyr)
dclus_srvyr_1 <- as_survey_design(.data = apiclus1, 
                                ids = dnum, 
                                weights = pw, 
                                fpc = fpc)
dclus_srvyr_2 <- mutate(dclus_srvyr_1, 
                    !!vars[[1]] := api00 + 1)

All versions have the same result:

lapply(list(dclus2, dclus_prep_2, dclus_srvyr_2), 
  function(design) svymean(~api00_a, design=design))
[[1]]
          mean     SE
api00_a 645.17 23.542

[[2]]
          mean     SE
api00_a 645.17 23.542

[[3]]
          mean     SE
api00_a 645.17 23.542
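If several dynamically named columns are needed at once, the direct-assignment approach can be wrapped in a loop. This is a minimal sketch, not a definitive implementation: the `increments` vector is a hypothetical input introduced for illustration, and the loop assumes each new column is computed without any survey functions.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# New column names come from an external character vector.
vars <- c("api00_a", "api00_b")
# Hypothetical: the amount added to api00 for each new column.
increments <- c(1, 2)

dclus2 <- dclus1
for (i in seq_along(vars)) {
  # Indexing the data frame by a character name lets the name be dynamic.
  dclus2$variables[, vars[i]] <- dclus2$variables[, "api00"] + increments[i]
}

# reformulate() builds the one-sided formula ~api00_a + api00_b from the names.
svymean(reformulate(vars), design = dclus2)
```

Using `reformulate()` keeps the later `svymean()` call in step with the same name vector, so no column name has to be typed by hand.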
Mathdragon
1

You can do this with bquote. For example:

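vars <- c("api00_plus_1", "api00_plus_2")
exprs <- list(quote(api00 + 1), quote(api00 + 2))
names(exprs) <- vars

# Inspect the call that splicing builds
bquote(update(dclus1, ..(exprs)), splice = TRUE)

# Evaluate that call to get the updated design
dclus2 <- eval(bquote(update(dclus1, ..(exprs)), splice = TRUE))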

Here's another chunk from inside the survey package, which converts any string variables mentioned in a formula to factors:

strings_to_factors<-function(formula,  design){
    allv<-intersect(all.vars(formula), colnames(design))
    vclass<-sapply(model.frame(design)[,allv,drop=FALSE], class)
    if (!any(vclass=="character")) return(design)
    vfix<-names(vclass)[vclass=="character"]
    l<-as.list(vfix)
    names(l)<-vfix
    fl<-lapply(l, function(li) bquote(factor(.(as.name(li)))))
    expr<-bquote(update(design, ..(fl)), splice=TRUE)
    eval(expr)
}
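The same pattern can be wrapped in a small helper that takes the new names as a character vector. The sketch below is only an illustration: `add_offset_columns()` and its arguments are hypothetical names, not part of the survey package, and it relies on `bquote()` supporting `splice = TRUE`, so it needs a reasonably recent version of R.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Hypothetical helper (not in survey): for each name in new_names, add a
# column equal to the source column plus the matching offset.
add_offset_columns <- function(design, source, new_names, offsets) {
  # Build one unevaluated expression per offset, e.g. api00 + 1.
  exprs <- lapply(offsets, function(k) bquote(.(as.name(source)) + .(k)))
  # The list names become the argument names passed to update().
  names(exprs) <- new_names
  # Splice the named expressions into a single update() call and run it.
  eval(bquote(update(design, ..(exprs)), splice = TRUE))
}

dclus2 <- add_offset_columns(dclus1, "api00", c("api00_a", "api00_b"), c(1, 2))
svymean(~api00_a + api00_b, design = dclus2)
```

Because the expressions are evaluated by `update()` inside the design's variables, the new columns can refer to any existing column by name.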
Thomas Lumley