How to update the column-name without entering a string in R

Question

I am working with a WHO macro for transforming anthropometric parameters into Z-scores.

For the purpose of the question, calling the who2007 function requires the name of the data frame followed by the bare (unquoted) names of the variables (columns), much like in ggplot. The problem is that, if the column name is Age, entering argument=Age is different from entering argument='Age'. The former returns a double but the latter returns a list. I assume this is the same difference as between df$Age and df['Age'].
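A minimal sketch of the general `$` versus `[` difference assumed above, using a made-up data frame (the name `toy` and its values are only for illustration). It shows the types each extraction returns; it does not confirm that this is what actually happens inside who2007.

```r
# Toy data frame; the column name Age and the values are made up
toy <- data.frame(Age = c(61, 72.5, 84))

by_dollar <- toy$Age       # extracts the column itself
by_single <- toy["Age"]    # keeps the data frame wrapper
by_double <- toy[["Age"]]  # extracts the column and accepts a string

is.double(by_dollar)  # TRUE
is.list(by_single)    # TRUE, a data frame is a list of columns
is.double(by_double)  # TRUE
```

Of the three, only `[[` both takes a character string and returns the bare vector.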

Suppose I have a character vector of column names and need to run the same code with a different column each time. If I pass the entries of that vector one by one, the function throws an error because it internally encounters a list instead of a double. How do I circumvent this? One way I can think of is to use column numbers, or a grep-style lookup to find the column numbers, but is there a better method?

ADDENDUM

Here is part of the function source code, which I think might explain the problem:

who2007 <- function(FileLab="Temp",FilePath="C:\\Documents and Settings",mydf,sex,age,weight,height,oedema=rep("n",dim(mydf)[1]),sw=rep(1,dim(mydf)[1])) {

#############################################################################
###########   Calculating the z-scores for all indicators
#############################################################################

   old <- options(warn=(-1))

   sex.x<-as.character(get(deparse(substitute(mydf)))[,deparse(substitute(sex))])
   age.x<-as.double(get(deparse(substitute(mydf)))[,deparse(substitute(age))])
   weight.x<-as.double(get(deparse(substitute(mydf)))[,deparse(substitute(weight))])
   height.x<-as.double(get(deparse(substitute(mydf)))[,deparse(substitute(height))])
   if(!missing(oedema)) oedema.vec<-as.character(get(deparse(substitute(mydf)))[,deparse(substitute(oedema))]) else oedema.vec<-oedema
   if(!missing(sw)) sw<-as.double(get(deparse(substitute(mydf)))[,deparse(substitute(sw))]) else sw<-as.double(sw)
   sw<-ifelse(is.na(sw),0,sw)

    sex.vec<-NULL
   sex.vec<-ifelse(sex.x!="NA" & (sex.x=="m" | sex.x=="M" | sex.x=="1"),1,ifelse(sex.x!="NA" & (sex.x=="f" | sex.x=="F" | sex.x=="2"),2,NA))
    age.vec<-age.x
    height.vec<-height.x
   oedema.vec<-ifelse(oedema.vec=="n" | oedema.vec=="N","n",ifelse(oedema.vec=="y" | oedema.vec=="Y","y","n"))

   mat<-cbind.data.frame(age.x,as.double(sex.vec),weight.x,height.x,oedema.vec,sw,stringsAsFactors=F)
    names(mat)<-c("age.mo","sex","weight","height","oedema","sw")

    mat$cbmi<-mat$weight/((height.vec/100)^2)
    mat$zhfa<-NULL
    mat$fhfa<-NULL
    mat$zwfa<-NULL
    mat$fwfa<-NULL
    mat$zbfa<-NULL
    mat$fbfa<-NULL

#############################################################################
###########   Calculating the z-scores for all indicators
#############################################################################

cat("Please wait while calculating z-scores...\n") 

### Height-for-age z-score

mat<-calc.zhfa(mat,hfawho2007)

### Weight-for-age z-score

mat<-calc.zwei(mat,wfawho2007)

### BMI-for-age z-score

mat<-calc.zbmi(mat,bfawho2007)


#### Rounding the z-scores to two decimals

            mat$zhfa<-rounde(mat$zhfa,digits=2)
            mat$zwfa<-rounde(mat$zwfa,digits=2)
            mat$zbfa<-rounde(mat$zbfa,digits=2)

#### Flagging z-score values for individual indicators

            mat$fhfa<-ifelse(abs(mat$zhfa) > 6,1,0)
            mat$fwfa<-ifelse(mat$zwfa > 5 | mat$zwfa < (-6),1,0)
            mat$fbfa<-ifelse(abs(mat$zbfa) > 5,1,0)

if(is.na(mat$age.mo) & mat$oedema=="y") {
mat$fhfa<-NA
mat$zwfa<-NA
mat$zbfa<-NA
}

mat<-cbind.data.frame(mydf,mat[,-c(2:6)])

ADDENDUM 2

The script is also intended to be run by multiple users, for whom modifying the source code might not be possible. Is there a way to avoid modifying the function source code?

Please provide example of your function `who2007`, it doesn't have to be complete function, simple enough so we can see input arguments, and data subsetting part. I am guessing you could pass colnames as character vector, and inside your function use `lapply` to loop through the columns? — zx8754, Mar 16 '18 at 08:21
@zx8754 I added the function. I did not get the second part of your comment. — stochastic13, Mar 16 '18 at 08:26
I guess you can rewrite the function to accept strings to suit your need instead of using a series of `get` & `deparse(substitute())` — Tung, Mar 16 '18 at 08:33
@Tung I do not know the usage of Deparse and the argument-handling as written in the function. Can you let me know what to change or what the current code does so that I can modify it? — stochastic13, Mar 16 '18 at 08:34
https://stackoverflow.com/questions/45176431/extract-name-of-data-frame-in-r-as-character/45176503 — Tung, Mar 16 '18 at 08:44
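A rough sketch of what the comments suggest, assuming the function can be rewritten: take the column names as strings and index with `[[` instead of `get` and `deparse(substitute())`. The function name `zprep`, the data frame `dat`, and its column names are hypothetical, and only the argument handling is shown, not the z-score calculation of the WHO macro.

```r
# Hypothetical, simplified variant that accepts column names as strings
# (this is not the WHO macro itself, only its argument-handling step)
zprep <- function(mydf, sex, age, weight, height) {
  cols <- c(sex, age, weight, height)
  missing_cols <- setdiff(cols, colnames(mydf))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  # [[ ]] takes a string and returns the column as a plain vector
  data.frame(
    sex    = as.character(mydf[[sex]]),
    age.mo = as.double(mydf[[age]]),
    weight = as.double(mydf[[weight]]),
    height = as.double(mydf[[height]]),
    stringsAsFactors = FALSE
  )
}

# Made-up example data
dat <- data.frame(Sex = c("m", "f"), AgeMo = c(70, 82),
                  Wt = c(20, 24), Ht = c(100, 120))
out <- zprep(dat, sex = "Sex", age = "AgeMo", weight = "Wt", height = "Ht")
```

Because every argument is an ordinary string, the names can come straight from a character vector inside a loop.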

score 1 · Answer 1 · answered Mar 16 '18 at 09:03

We could test whether the input data frame has the required columns and then get rid of the `deparse`/`get` step, e.g.:

who2007 <- function(FileLab = "Temp", FilePath = "C:\\Documents and Settings",
                    mydf,
                    oedema = rep("n",dim(mydf)[1]),sw=rep(1,dim(mydf)[1])) {

  if(!all(c("sex", "age", "weight", "height") %in% colnames(mydf))) stop("mydf, must have 'sex', 'age', 'weight', 'height' columns")

  sex.x <- mydf$sex
  age.x <- mydf$age
  # ...
  # some code
  # ...

  #return
  list(sex.x, age.x)
}

Testing:

#example dataframe   
x <- head(mtcars)

# this errors as required columns are missing
who2007(mydf = x)
# Error in who2007(mydf = x) : 
#   mydf, must have 'sex', 'age', 'weight', 'height' columns

# now update columns with required column names, and it works fine:
colnames(x)[1:4] <- c("sex", "age", "weight", "height")
who2007(mydf = x)
# [[1]]
# [1] 21.0 21.0 22.8 21.4 18.7 18.1
# 
# [[2]]
# [1] 6 6 4 6 8 6
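To iterate over different columns with a function like the one above that expects fixed column names, one option is to select the columns by string and rename them on a temporary copy in each iteration. This is only a sketch: `fixed_names_fn` is a hypothetical stand-in for the rewritten function (its body is a placeholder BMI calculation), and the data frame `dat` and its column names are made up.

```r
# Hypothetical stand-in for a function that expects fixed column names
fixed_names_fn <- function(mydf) {
  required <- c("sex", "age", "weight", "height")
  if (!all(required %in% colnames(mydf))) {
    stop("mydf must have 'sex', 'age', 'weight', 'height' columns")
  }
  mydf$weight / (mydf$height / 100)^2  # placeholder computation (BMI)
}

# Made-up data with two alternative height columns
dat <- data.frame(Sex = c("m", "f"), AgeMo = c(70, 82),
                  Wt = c(20, 24), Ht1 = c(100, 120), Ht2 = c(110, 125))

height_cols <- c("Ht1", "Ht2")
res <- lapply(height_cols, function(h) {
  tmp <- dat[, c("Sex", "AgeMo", "Wt", h)]              # select columns by string
  colnames(tmp) <- c("sex", "age", "weight", "height")  # rename on a copy only
  fixed_names_fn(tmp)
})
names(res) <- height_cols
```

The original data frame is never renamed, so each iteration starts from the same column names.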

zx8754


Thanks. What exactly does `deparse get` do, seeing that its presence seems redundant? – stochastic13 Mar 16 '18 at 09:05
@SatwikPasani Try: `deparse(substitute(sex))` – zx8754 Mar 16 '18 at 09:09
And there is no argument to put the name of the column (updated across iterations) in the function you mention. Am I missing something? Still a beginner at R – stochastic13 Mar 16 '18 at 09:13
Also, can you check the addendum 2 in the question? – stochastic13 Mar 16 '18 at 09:17
@SatwikPasani Not sure I understand, you want to fix the problem without doing any changes to the function? – zx8754 Mar 16 '18 at 09:39
If possible, yes. But now that I am reading up on substitute and deparse, it seems unlikely. And in the function you wrote, where to enter the iteratively updated column name? – stochastic13 Mar 16 '18 at 09:43
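If the function really cannot be modified, one possible workaround is to build the call programmatically, turning each string into a symbol with `as.name()` and passing it through `do.call()`. The sketch below uses `toy_nse`, a hypothetical stand-in that copies only the `get(deparse(substitute(...)))` lines quoted in the question; `growth_df` and its columns are made up, and whether the full who2007 behaves the same way has not been checked here.

```r
# Hypothetical stand-in that resolves its arguments the way the quoted
# who2007 lines do
toy_nse <- function(mydf, age) {
  as.double(get(deparse(substitute(mydf)))[, deparse(substitute(age))])
}

# What such a function sees: a bare name deparses to the plain column name,
# while a quoted string keeps its quotes and matches no column
show_arg <- function(x) deparse(substitute(x))
bare   <- show_arg(Age)    # "Age"
quoted <- show_arg("Age")  # "\"Age\""

# Made-up data and column names
growth_df <- data.frame(Age = c(61, 72), AgeAtFollowUp = c(80, 90))
cols <- c("Age", "AgeAtFollowUp")

# as.name() turns each string into a symbol, so the call built by do.call()
# looks like toy_nse(mydf = growth_df, age = Age) typed by hand
res <- lapply(cols, function(col) {
  do.call(toy_nse, list(mydf = as.name("growth_df"), age = as.name(col)))
})
names(res) <- cols
```

Note that the data frame is also passed as a symbol: because the function looks it up with `get()`, it has to be reachable by name from where the function was defined, typically the global environment.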
