I'm writing a simple function that creates a new variable containing the number of missing values in each column of a dataset. I am using the assign function to name that variable based on the function's input.

report.NA <- function(v){
    nam <- deparse(substitute(v))
    newvar <-paste0(nam,"NAs")
    as.data.frame(assign(newvar,colSums(is.na(v)),envir=parent.frame()))
    message(paste("Sum of NAs in",nam,"dataset:",newvar),appendLF=FALSE)
}

For the sake of reproducibility:

set.seed(1)
df<-matrix(1,nrow=10,ncol=5)
dimnames(df)<-list(rownames(df),colnames(df,do.NULL=F))
df[sample(1:length(df), 10)] <- NA

Running the function on df creates a new variable called dfNAs:

> dfNAs
col1 col2 col3 col4 col5 
   2    2    3    0    3 

The issue I am running into is that I want the output variable to be a data.frame. I know the obvious way to do this outside the function is to run as.data.frame(dfNAs), but I would like the function itself to create the new variable from assign as a data frame. Is there a way to do this?

The overarching question is how to refer, from inside the function, to the variable whose name was created with assign, and whether that is even possible. It seems like a naive question, but I haven't been able to find an answer yet.

EJJ

1 Answer

Not sure I understand what is desired but this reworking might point you in a favorable direction. Using as.list will convert a named vector to a multi-element named list which the ordinary data.frame function can accept to make multiple columns:

report.NA <- function(v){
    nam <- deparse(substitute(v))
    newvar <-paste0(nam,"NAs")
    assign(newvar,data.frame(as.list(colSums(is.na(v)))),envir=parent.frame())
    message(paste("Sum of NAs in",nam,"dataset:",newvar),appendLF=FALSE)
}
report.NA(df)
#Sum of NAs in df dataset: dfNAs

> dfNAs
  col1 col2 col3 col4 col5
1    2    2    3    0    3

> str(dfNAs)
'data.frame':   1 obs. of  5 variables:
 $ col1: num 2
 $ col2: num 2
 $ col3: num 3
 $ col4: num 0
 $ col5: num 3
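
On the second part of the question (referring to the assigned variable from inside the function), one option is get(), which looks an object up by its name given as a string. The following is only a sketch under that assumption: report.NA2, env, result and out are hypothetical names introduced for illustration and are not part of the original code.

```r
# Hypothetical variant of report.NA that reads the assigned object back by name
report.NA2 <- function(v) {
    nam <- deparse(substitute(v))
    newvar <- paste0(nam, "NAs")
    env <- parent.frame()  # the caller's environment
    # Build a one-row data frame: one column per column of v
    assign(newvar, data.frame(as.list(colSums(is.na(v)))), envir = env)
    # get() retrieves the object from the same environment using its name
    result <- get(newvar, envir = env)
    message("Sum of NAs in ", nam, " dataset (stored in ", newvar, "):")
    print(result)
    invisible(result)
}

# Same setup as in the question
set.seed(1)
df <- matrix(1, nrow = 10, ncol = 5)
dimnames(df) <- list(rownames(df), colnames(df, do.NULL = FALSE))
df[sample(length(df), 10)] <- NA

out <- report.NA2(df)
is.data.frame(dfNAs)   # TRUE
identical(out, dfNAs)  # TRUE
```

Returning the data frame invisibly also lets the caller write out <- report.NA2(df) and skip the side effect of assign if that is preferred.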
IRTFM