I have an H2O frame object in R like this:
h2odf
A | B | C | D
--|---|---|---
1 | NA| 2 | 0
2 | 1 | 2 | 0
3 | NA| 2 | 0
4 | 3 | 2 | 0
I want to remove all rows where B is NA (the 1st and 3rd rows). I have tried:
na <- is.na(h2odf[,"b"])
h2odf <- h2odf[!na,]
and
h2odf <- h2odf[!is.na(h2odf$B),]
and
h2odf <- subset(h2odf, B!=NA)
These approaches work on a regular R data frame but not on the H2O frame, where I get this error:
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page, :
ERROR MESSAGE:
DistributedException from localhost/127.0.0.1:54321: 'Cannot set illegal UUID value'
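For comparison, here is a minimal sketch of the same filtering on a plain R data frame. The name `df` is just a hypothetical local copy of the table above, not my real data. Note that the `is.na()` forms keep rows 2 and 4, while the `B != NA` form keeps no rows even in base R, because any comparison with NA evaluates to NA.

```r
# Hypothetical local copy of the table above, as a plain R data frame
df <- data.frame(A = 1:4, B = c(NA, 1, NA, 3), C = 2, D = 0)

# Logical-index form: keeps only the rows where B is not NA
kept <- df[!is.na(df$B), ]

# subset() form using is.na(): same rows as above
kept_subset <- subset(df, !is.na(B))

# Comparing against NA yields NA for every row, and subset()
# treats NA as FALSE, so this returns a data frame with no rows
none <- subset(df, B != NA)
```

So for the H2O frame, the `is.na()` based forms are the ones I would expect to work.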
The desired output is:
h2odf
A | B | C | D
--|---|---|---
2 | 1 | 2 | 0
4 | 3 | 2 | 0
One option I have is to convert it into an R data frame, remove the rows, and convert it back to an H2O frame. But that takes a long time because the input file is close to 4.5 GB. Is it possible to do this on the H2O frame (hex object) itself?
I am running RStudio on an AWS cluster.