I'm trying to do something in R that is apparently very easy (sorry, I'm very new to data.table), but I can't get the right solution. I'm trying to delete the rows with NA values in a specific column ("Ground_Tru"). This is my attempt so far:

all_data <- fread ("all_vbles.txt",header=TRUE, na.strings=c("NA","N/A",""))
na.omit (all_data, cols="Ground_Tru")

I get the message

Empty data.table (0 rows) of 75 cols: OID_,IN_FID,Polygon_ID,DIST_highw,DIST_railw,DIST_port...

However, the "Ground_Tru" field has many NA values. Thanks in advance for any help,

vizpi
  • If you are using `na.omit` and there is any NA in one of the columns, the whole row will be omitted. Please let us know how you want to proceed – akrun Jun 14 '17 at 10:13
  • `all_data[!is.na(Ground_Tru)]` ? – simone Jun 14 '17 at 10:23
  • @simone, it should be `all_data[!is.na(Ground_Tru),]` to keep all non-NA rows. – parth Jun 14 '17 at 10:30
  • @ParthChaudhary It is a data.table, it should work without `,` – akrun Jun 14 '17 at 10:32
  • Thanks Akrun, in fact this could be the case, since I have some columns with all NA values. However, from the help I read that it is possible to specify the column in which to check for missing values... – vizpi Jun 14 '17 at 10:35
  • Thanks @simone, in fact this works. I was just wondering how to do it with na.omit to speed up the computation... – vizpi Jun 14 '17 at 10:35
  • Your code should work, as far as I can tell. There are diagnostics we could suggest to figure out why it doesn't for your input data, but really you ought to create a reproducible example before posting, I guess. Some guidance: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 – Frank Jun 14 '17 at 12:34
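
To make the point debated in the comments concrete, here is a minimal sketch on a toy table. The data and the extra columns `a` and `b` are invented for illustration, and it assumes a data.table version whose `na.omit` method accepts the `cols` argument, as the help page mentioned above describes.

```r
library(data.table)

# Toy table (hypothetical data): column b is entirely NA,
# similar to the all-NA columns mentioned in the comments.
dt <- data.table(
  Ground_Tru = c(1.5, NA, 3.2, NA),
  a          = c("x", "y", NA, "z"),
  b          = NA_real_
)

# Without cols, every column is checked; b is all NA, so no row survives.
all_cols <- na.omit(dt)

# With cols, only Ground_Tru is checked; NAs in other columns are kept.
by_col <- na.omit(dt, cols = "Ground_Tru")

# Plain subsetting on the same condition; no trailing comma is needed
# because dt is a data.table.
by_subset <- dt[!is.na(Ground_Tru)]

print(nrow(all_cols))
print(by_col)
print(by_subset)
```

If `na.omit` with `cols` returns an empty table on real data while the `is.na` subset does not, that points at how the file was read rather than at `na.omit` itself.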

2 Answers

Use complete.cases:

all_data <- all_data[complete.cases(all_data[, 'Ground_Tru'])]
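
As a rough illustration of how this behaves, the sketch below applies `complete.cases` to one column and then to two columns of a small made-up table (the data and the column `a` are hypothetical, not from the question).

```r
library(data.table)

# Toy table (hypothetical data).
dt <- data.table(
  Ground_Tru = c(1.5, NA, 3.2, NA),
  a          = c("x", "y", NA, "z")
)

# dt[, "Ground_Tru"] is a one-column data.table, so complete.cases
# returns one logical value per row for that column only.
keep_one <- complete.cases(dt[, "Ground_Tru"])
one_col  <- dt[keep_one]

# Passing several columns requires all of them to be non-NA in a row.
keep_two <- complete.cases(dt[, c("Ground_Tru", "a")])
two_cols <- dt[keep_two]

print(one_col)
print(two_cols)
```

The same pattern extends to any subset of columns, which may be handy when more than one field must be present.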
krassowski
rankthefirst

In the end I managed to solve the problem. Apparently there are some issues with R reading the column names when using the data.table library, so I followed one of the suggestions provided here: "read.table doesn't read in column names"

so the code became like this:

library(data.table)

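# Read only the header line plus one row to recover the column names
header <- names(read.table("all_vbles.txt", header=TRUE, nrows=1, sep=",", dec=".", quote=""))
# Read the data without the header line, then restore the names
all_data <- fread("all_vbles.txt", header=FALSE, skip=1, sep="auto", na.strings=c("NA","N/A",""))
setnames(all_data, header)
# Drop the rows where Ground_Tru is NA
test_data <- na.omit(all_data, cols="Ground_Tru")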

which seemed to work fine.
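
Since the fix above suggests the column names were not read as expected, a quick check along these lines may help confirm that before calling `na.omit`. This is only a sketch: the inline text stands in for `all_vbles.txt`, its contents are invented, and `fread(text = ...)` assumes a reasonably recent data.table.

```r
library(data.table)

# Hypothetical stand-in for all_vbles.txt (contents invented).
txt <- "OID_,Ground_Tru,DIST_port\n1,NA,10\n2,5,N/A\n3,,7\n"
all_data <- fread(text = txt, header = TRUE,
                  na.strings = c("NA", "N/A", ""))

# Inspect the names fread actually assigned.
col_names <- names(all_data)
print(col_names)

# Confirm the target column exists under the expected name.
has_col <- "Ground_Tru" %in% col_names
print(has_col)

# Count the missing values in that column.
n_missing <- all_data[, sum(is.na(Ground_Tru))]
print(n_missing)
```

If the expected name is missing from `names(all_data)`, or the NA count is not what the raw file suggests, the problem lies in the import step.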

vizpi