Let's say I have a data frame like this one:

>jnk1
   V1    V2    V3  V4  V5   V6  V7  V8  V9  V10  V11  V12    
1  23207 NA   450  NA  602  NA  NA  NA  NA  456  782  7439
2  1506  NA  1101  NA  303  NA  NA  NA  NA  378  378  8922
3  234   NA   456  NA  507  NA  NA  NA  NA  2384 333  6282  
4  672   NA   345  NA  918  NA  NA  NA  NA  332  322  3782  
5  980   NA   295  NA  972  NA  NA  NA  NA  2782 344  2789
6  459   NA   218  NA  810  NA  NA  NA  NA  3278 378  2782

I want to write a script that removes all the columns containing "NA" and returns a data frame with the rest of the columns. For example, I want the output to be a data frame that contains only the columns V1, V3, V5, V10, V11, V12 and all the rows.

I wrote the following code for it:

for (i in 1:ncol(jnk1)) {
  if (unique(is.na(jnk1[i])) == TRUE)
    jnk1[i] <- NULL
}

But R is showing me an error:

Error in `[.data.frame`(jnk2, i) : undefined columns selected

and it removes only some of the columns with NA. If I run the script again, it shows the same error again, and eventually all the desired columns are removed.

What is it that I'm missing here?

Debjyoti
  • https://stackoverflow.com/questions/2643939/remove-columns-from-dataframe-where-all-values-are-na – Ronak Shah Oct 19 '20 at 09:14
  • I think you are missing a comma before the ```i```, as in ```jnk1[,i]``` in the loop. Your if clause is also dangerous. I would go with ```all``` or ```any``` instead of ```unique``` – SebSta Oct 19 '20 at 09:16
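
One likely reason the loop misbehaves: each `jnk1[i] <- NULL` shortens the data frame, so the remaining columns shift left while `i` still runs up to the original `ncol(jnk1)`. Some columns get skipped and later indices no longer exist, which would explain both the partial removal and the "undefined columns selected" error. A minimal sketch of a loop-free alternative, following the suggestion to use `any`, might look like this. The data frame here is a shortened copy of the first five columns from the question, and the names `has_na` and `jnk1_clean` are made up for illustration:

```r
# Shortened copy of the example data (first five columns from the question).
jnk1 <- data.frame(
  V1 = c(23207, 1506, 234, 672, 980, 459),
  V2 = NA,
  V3 = c(450, 1101, 456, 345, 295, 218),
  V4 = NA,
  V5 = c(602, 303, 507, 918, 972, 810)
)

# One logical flag per column: TRUE if the column contains at least one NA.
# (has_na is an illustrative name, not from the question.)
has_na <- sapply(jnk1, function(col) any(is.na(col)))

# Keep only the columns without NA in a single step, so no index shifts
# while columns are being dropped. drop = FALSE keeps the result a data
# frame even if only one column survives.
jnk1_clean <- jnk1[, !has_na, drop = FALSE]

names(jnk1_clean)
```

Replacing `any` with `all` would drop only the columns that are entirely NA, which gives the same result for the data shown in the question.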