
I have the following table, which I generated with table(data$a, data$b):

         a   b   c  NA
d    0  45  42  63   0
e    0  12  45  63   0
f    0  95  65  21   0
NA   0   0   0   0   0

How can I remove the columns labelled "" and NA?

Here is a reproducible example:

a b
a d
a d
a d
a d
a d
a d
a d
a d
a d
a d
a d
a d
b d 
b d
b e
b e
b e
b e
c e
c e
c e
c e
c e
c e
c e
c e
c e
c e
c f
c f
c f
c f
c f
c f
c f
c f
c f
c f
c f

Note that there are no "" or NA values in the data, but they still appear in the table.

In this table, both of the variables are factors.

Thank you!

Jennifer
  • You should provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – M-- Dec 03 '19 at 21:57
  • @Jennifer Please check my update. It would be unused levels issue – akrun Dec 03 '19 at 22:09

2 Answers


It is possible that the NAs are the character string "NA" rather than a real NA; otherwise table(), with its default useNA = "no", would have dropped them. One option is to change the values "" and "NA" to NA:

df1[df1 == "NA"|df1 == ""] <- NA

This assumes a two-column data frame in which all columns are of class character.

Update

If the table shows "NA" or "" even though the data contain neither, the columns are likely factors that still carry unused levels. One option is to call droplevels() and then apply table():

table(droplevels(df1))
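
As a minimal sketch of the unused-levels case, assume a made-up data frame df1 (not the asker's data) whose factor columns declare "" and "NA" as levels although neither value occurs. The names with_unused and cleaned are only illustrative:

```r
# Hypothetical data: both factors declare "" and "NA" as levels,
# but no observation actually takes either value.
df1 <- data.frame(
  a = factor(c("a", "a", "b", "c"), levels = c("", "a", "b", "c", "NA")),
  b = factor(c("d", "d", "e", "f"), levels = c("", "d", "e", "f", "NA"))
)

# table() reports every declared level, so all-zero rows and columns show up.
with_unused <- table(df1$a, df1$b)
print(dim(with_unused))

# droplevels() removes levels that have no observations before tabulating.
cleaned <- table(droplevels(df1))
print(dim(cleaned))
```

Comparing the two dim() results shows whether unused levels were responsible for the extra rows and columns.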
akrun

If the table is stored in a variable called mytable, you could try the following:

# Match names with the vectorized %in%; `||` only compares single values
bad_cols <- colnames(mytable) %in% c("NA", "")
mytable <- mytable[, !bad_cols, drop = FALSE]

This first identifies the columns named "NA" or "", then excludes them by subsetting and saves the result back in the variable mytable.
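
A short sketch of the same name-based subsetting on a made-up table (the factors x and y are hypothetical, not the asker's data). Filtering the rows too is an addition here, since the table in the question also has an NA row:

```r
# Hypothetical factors with "" and "NA" declared as levels but never observed.
x <- factor(c("d", "e", "f", "d"), levels = c("d", "e", "f", "NA"))
y <- factor(c("a", "b", "c", "a"), levels = c("", "a", "b", "c", "NA"))
mytable <- table(x, y)

# Names to drop from either dimension.
bad <- c("NA", "")
keep_rows <- !(rownames(mytable) %in% bad)
keep_cols <- !(colnames(mytable) %in% bad)

# Logical indexing is safe even when nothing matches `bad`;
# drop = FALSE keeps the result two-dimensional.
mytable <- mytable[keep_rows, keep_cols, drop = FALSE]
print(dimnames(mytable))
```

Logical indexing is used instead of negative positions because mytable[, -integer(0)] would select no columns at all when there is nothing to remove.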

M M