k<-21;

for(i in 5:k)
{
pharma[,i][pharma[,i]=="#N/A"]<- NA
pharma[,i][pharma[,i]=="NM"]<- NA
num<-sum(is.na(pharma[,i]))
n=1-num/length(pharma[,i])

if(n<0.8) {
rm(pharma[,i])
Else n=0
}
}

Basically, I am trying to replace the placeholder values "#N/A" and "NM" with NA in each column, and then remove the columns that contain too many NAs.

vj.vijay
  • Can you expand the question to explain where the error is arising? – James Apr 05 '13 at 12:16
  • Read this! http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Richie Cotton Apr 05 '13 at 13:06
  • downvoting due to neglect ... will upvote if improved. – Ben Bolker Apr 08 '13 at 19:59
  • Folks, sorry was busy at work. But nailed the error. below are the codes: for(i in c(5:18)){ pharma[,i][pharma[,i]== "#N/A"]<- NA pharma[,i][pharma[,i]== "NM"]<- NA num<-sum(is.na(pharma[,i])) n<-1-num/length(pharma[,i]) if(n >0.95) pharma1<-merge(m,pharma[,c(1,4,i)],all=TRUE) m<-pharma1 } – vj.vijay Apr 10 '13 at 11:10

2 Answers


You did not tell us what error the code produces, but here are several observations:

  • R is case sensitive: Else is not the same as else, and only the lowercase else is valid.
  • You do not close the if block with } before the else.
  • There is no need to loop over the columns explicitly.
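A minimal sketch of the first two points, assuming pharma is a data frame with character columns. The toy data, the column range 2:3, and the helper names drop and frac_ok are made up for illustration (the question uses columns 5 to 21). Note also that rm() removes whole objects, not data frame columns, so the sketch records the columns to drop and subsets once at the end.

```r
# Toy stand-in for the asker's data (hypothetical values).
pharma <- data.frame(
  id = 1:6,
  a  = c("1", "#N/A", "NM", "#N/A", "5", "6"),  # 3 of 6 are placeholders
  b  = c("1", "2", "3", "4", "NM", "6"),        # 1 of 6 is a placeholder
  stringsAsFactors = FALSE
)

cols <- 2:3            # columns to clean (5:21 in the question)
drop <- character(0)   # names of columns to remove afterwards

for (i in cols) {
  # turn both placeholder strings into real NA values
  pharma[[i]][pharma[[i]] %in% c("#N/A", "NM")] <- NA
  frac_ok <- mean(!is.na(pharma[[i]]))   # fraction of non-missing values

  if (frac_ok < 0.8) {
    # rm() cannot delete a column, so remember its name instead
    drop <- c(drop, names(pharma)[i])
  } else {
    # lowercase `else`, placed after the brace that closes the if block
    message("keeping column ", names(pharma)[i])
  }
}

# remove the flagged columns in one step
pharma <- pharma[, !(names(pharma) %in% drop), drop = FALSE]
```

Collecting the names first avoids deleting columns while the loop is still indexing them by position.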
Paul Hiemstra
  • Thanks Paul, found a way through, below is the new code: for(i in c(5:18)){ pharma[,i][pharma[,i]== "#N/A"]<- NA pharma[,i][pharma[,i]== "NM"]<- NA num<-sum(is.na(pharma[,i])) n<-1-num/length(pharma[,i]) if(n >0.95) pharma1<-merge(m,pharma[,c(1,4,i)],all=TRUE) m<-pharma1 } – vj.vijay Apr 10 '13 at 11:14
  • If this answers your question, I would add it as a separate answer, describe what you did to fix the situation, post the code, and accept it as the correct answer. – Paul Hiemstra Apr 10 '13 at 11:22

You probably want something like

## extract the columns to manipulate
pp <- pharma[,5:21]
## set relevant values to NA
## (the function must return x; assigning into pp[] keeps pp a data frame)
pp[] <- lapply(pp,function(x) { x[x %in% c("#N/A","NM")] <- NA; x })
## estimate fraction NA and test
badcols <- colMeans(is.na(pp))>0.2
## remove bad columns (drop=FALSE keeps a data frame even if one column is left)
pp <- pp[,!badcols,drop=FALSE]
## put the manipulated stuff back together with the original structure
pharma <- cbind(pharma[,1:4],pp)

but it's hard to tell exactly without a reproducible example.
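For reference, a reproducible example of the kind requested in the comments might look like the sketch below. The data frame toy, its column names, and its values are invented for illustration and are not the asker's data; it has two identifier columns followed by three measurement columns, and applies the same 20% NA threshold.

```r
# Invented data: two ID columns followed by three measurement columns.
toy <- data.frame(
  id   = 1:6,
  name = letters[1:6],
  x    = c("1.2", "#N/A", "NM", "#N/A", "NM", "3.4"),  # 4 of 6 are placeholders
  y    = c("5", "6", "NM", "8", "9", "10"),            # 1 of 6 is a placeholder
  z    = c("1", "2", "3", "4", "5", "6"),              # no placeholders
  stringsAsFactors = FALSE
)

# measurement columns only
meas <- toy[, 3:5]

# replace placeholders with NA; the function returns x,
# and assigning into meas[] keeps meas a data frame
meas[] <- lapply(meas, function(x) {
  x[x %in% c("#N/A", "NM")] <- NA
  x
})

# fraction of NA values per column
na_frac <- colMeans(is.na(meas))

# keep columns with at most 20% NA and reattach the ID columns
result <- cbind(toy[, 1:2], meas[, na_frac <= 0.2, drop = FALSE])

print(na_frac)
str(result)
```

With this toy input, x is dropped because two thirds of its values are missing, while y and z are kept.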

Ben Bolker