I'm working on a replication of a study using the data that can be found at this link. The file is named AProrok_AJPS.tab; click Download and choose the RData format.

I want to remove all rows whose value in a specific column is 1, using this code:

df <- data[data$unknownleader != 1, ]

After that, however, every value in the data becomes NA; it is basically all blank. I tried changing the column's class (integer, factor, etc.), but every attempt resulted in the same problem. I am not sure what it is about this data file that causes this. Could anyone please investigate and show me a possible way to fix it?

Gerry
  • @ErdemAkkas Hi, thanks, but `!=` means "different from"; the double equals sign `==` tests for equality. – Gerry May 17 '17 at 11:34
  • After loading the data, it seems that all the values for `unknownleader != 1` are in fact `NA`, so it is to be expected that R gives this answer. – Paul Hiemstra May 17 '17 at 11:40
  • @PaulHiemstra Yes, but after running that code, all information in all other entries becomes NA, everything. Should I replace the NA in that `unknownleader` column by 0's first? – Gerry May 17 '17 at 11:44
  • Indexing with NA probably causes your issues here. See this post for some pointers: http://stackoverflow.com/questions/16822426/r-dealing-with-true-false-na-and-nan. – Paul Hiemstra May 17 '17 at 12:01
  • Try `df <- data[which(data$unknownleader != 1), ]`. – ikop May 18 '17 at 05:10
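A minimal sketch of what the comments above describe, using a made-up toy data frame (`toy` and its values are hypothetical, not taken from AProrok_AJPS.tab): a logical index that contains NA returns an all-NA row for each NA, while `which()` keeps only the positions where the condition is TRUE.

```r
# Hypothetical toy data standing in for the real file (values are made up)
toy <- data.frame(id = 1:4, unknownleader = c(1, NA, 0, NA))

# The comparison yields FALSE, NA, TRUE, NA; each NA in a logical
# row index produces a row in which every column is NA
with_na_rows <- toy[toy$unknownleader != 1, ]

# which() returns only the positions where the condition is TRUE,
# so rows with a missing unknownleader are dropped rather than blanked
with_which <- toy[which(toy$unknownleader != 1), ]

print(with_na_rows)
print(with_which)
```

Note that `which()` also drops the rows where `unknownleader` is NA, which may or may not be what is wanted.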

1 Answer

Thanks to @PaulHiemstra for pointing out that the problem arises from the NA values in the dataset. Based on this thread, I came up with a solution:

First, replace all the NA values in the unknownleader column with 0 (df here is assumed to be the original, unfiltered data frame, called data in the question):

df$unknownleader <- replace(df$unknownleader, is.na(df$unknownleader), 0)

Then remove the rows as described in the question, in the usual way:

df <- df[df$unknownleader==0, ]

Note that because the unknownleader variable happens to be binary, it still makes sense to replace NA with 0. For other datasets, appropriate adjustments might be needed.
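If overwriting the NA values is not desirable, a possible alternative is to build the row filter so that it never evaluates to NA. The sketch below uses a hypothetical toy data frame (not the real data) and assumes that rows with a missing unknownleader should be kept:

```r
# Hypothetical toy data: the column holds only 1 or NA (values are made up)
toy <- data.frame(id = 1:4, unknownleader = c(1, NA, 1, NA))

# TRUE | NA evaluates to TRUE, so missing values are kept explicitly
# and the filter itself contains no NA
keep <- is.na(toy$unknownleader) | toy$unknownleader != 1
kept <- toy[keep, ]

# %in% never returns NA, so negating it gives the same rows
kept_in <- toy[!(toy$unknownleader %in% 1), ]

print(kept)
```

Either way, the NA values in the column are left untouched, so the distinction between "missing" and 0 is preserved.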

Gerry