Remove NA in a data.table in R

Question

I'm trying to do something in R that is apparently very easy (sorry, I'm very new to data.table), but I can't get to the right solution. I want to delete the rows with NA values in a specific column ("Ground_Tru"). This is my attempt so far:

all_data <- fread("all_vbles.txt", header=TRUE, na.strings=c("NA","N/A",""))
na.omit(all_data, cols="Ground_Tru")

I get the message

Empty data.table (0 rows) of 75 cols: OID_,IN_FID,Polygon_ID,DIST_highw,DIST_railw,DIST_port...

However, the "Ground_Tru" field has many NA values. Thanks in advance for any help.

If you use `na.omit` and there is an NA in any of the columns, the whole row will be omitted. Please let us know how you want to proceed — akrun, Jun 14 '17 at 10:13
@simone, it should be `all_data[!is.na(Ground_Tru),]` to keep all non-NA rows. — parth, Jun 14 '17 at 10:30
@ParthChaudhary It is a data.table, it should work without `,` — akrun, Jun 14 '17 at 10:32
Thanks Akrun, in fact this could be the case, since I have some columns with all NA values. However, from the help I read that it is possible to specify the columns in which to check for missing values... — vizpi, Jun 14 '17 at 10:35
Thanks @simone, in fact this works. I was just wondering how to do it with `na.omit` to speed up the computation... — vizpi, Jun 14 '17 at 10:35
Your code should work, as far as I can tell. There are diagnostics we could suggest to figure out why it doesn't for your input data, but really you ought to create a reproducible example before posting, I guess. Some guidance: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 — Frank, Jun 14 '17 at 12:34

score 2 · Answer 1 · edited Dec 01 '17 at 05:24


Use complete.cases:

all_data <- all_data[complete.cases(all_data[, 'Ground_Tru'])]
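To compare the approaches mentioned in this thread, here is a minimal sketch on made-up data. The table `toy` and its columns `id` and `other` are assumptions for illustration, not the asker's file. The three column-specific variants should keep the same rows, while `na.omit` without `cols` checks every column, so an all-NA column empties the result.

```r
library(data.table)

# Hypothetical toy data: 'other' is entirely NA, 'Ground_Tru' has two NAs
toy <- data.table(
  id         = 1:5,
  Ground_Tru = c(1, NA, 3, NA, 5),
  other      = NA_real_
)

# Three ways to drop rows where only Ground_Tru is NA
kept_isna     <- toy[!is.na(Ground_Tru)]
kept_complete <- toy[complete.cases(toy[, "Ground_Tru"])]
kept_naomit   <- na.omit(toy, cols = "Ground_Tru")

# Without 'cols', all columns are checked, so every row is dropped here
kept_all_cols <- na.omit(toy)

print(kept_naomit)
print(nrow(kept_all_cols))
```

If `na.omit(..., cols = ...)` returns zero rows on real data, comparing it against the `!is.na()` form on the same table is a quick way to see whether the column itself is the problem.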

edited Dec 01 '17 at 05:24 by krassowski

answered Nov 01 '17 at 06:50 by rankthefirst

score 1 · Answer 2 · answered Jun 15 '17 at 09:13

In the end I managed to solve the problem. Apparently there are some issues with how R reads the column names when using the data.table library, so I followed one of the suggestions provided here: read.table doesn't read in column names

so the code became like this:

library(data.table)

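# Read the header line plus one data row, and keep only the column names
header <- names(read.table("all_vbles.txt", header=TRUE, nrows=1, sep=",", dec=".", quote=""))
# Read the data without the header line, then assign the names
all_data <- fread("all_vbles.txt", header=FALSE, skip=1, sep="auto", na.strings=c("NA","N/A",""))
setnames(all_data, header)
test_data <- na.omit(all_data, cols="Ground_Tru")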

which seemed to work fine.
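Before resorting to a workaround like this, it may help to check whether the column was read under the expected name and how many missing values it holds. The following is only a sketch: the temporary file and its two columns are a hypothetical stand-in for `all_vbles.txt`, so that the snippet runs by itself.

```r
library(data.table)

# Hypothetical stand-in for the real file, written to a temporary location
tmp <- tempfile(fileext = ".txt")
writeLines(c("OID_,Ground_Tru", "1,3", "2,N/A", "3,", "4,7"), tmp)

all_data <- fread(tmp, header = TRUE, na.strings = c("NA", "N/A", ""))

# Was the column read under the expected name?
has_col <- "Ground_Tru" %in% names(all_data)

# How many values in that column are missing?
n_missing <- all_data[, sum(is.na(Ground_Tru))]

# How many rows survive when only that column is checked?
n_kept <- nrow(na.omit(all_data, cols = "Ground_Tru"))

print(c(has_col = has_col, n_missing = n_missing, n_kept = n_kept))
```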
