Dynamic filtering of rows in R

Question

I have a data frame with more than 2000 features of categorical, numerical and logical types.

The constraint for further processing is that the numerical features must not contain any value < 0. However, such values are present in the data set.

I would like to know how to remove all examples (rows) from the underlying data set where at least one numerical feature is negative.

I already tried `apply(df, 1, function(x) any(as.numeric(x) < 0))`. However, this converts my categorical features to NA.
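For context, a minimal sketch of why this attempt fails: `apply()` converts a data frame with `as.matrix()` first, and with mixed column types the result is a character matrix, so `as.numeric()` has nothing sensible to return for the categorical entries. The data frame `toy` and its columns are hypothetical stand-ins for the real data.

```r
# Hypothetical toy data standing in for the real data frame
toy <- data.frame(
  a = c(1, -2, 3),
  g = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

# apply() operates on as.matrix(toy), which is a character matrix here
m <- as.matrix(toy)
is.character(m)

# Converting the categorical entries yields NA (normally with a warning)
g.as.num <- suppressWarnings(as.numeric(m[, "g"]))
g.as.num
```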

Please provide a small reproducible example of data (use `dput`) to enable testing of potential solutions: http://stackoverflow.com/a/5963610/1412059 — Roland, Apr 07 '16 at 10:10

score 0 · Accepted Answer · answered Apr 07 '16 at 10:16

First find all numerical columns:

df.classes <- sapply(df, function(x) class(x)[1])   # first class of each column, as a character vector
df.num     <- which(df.classes %in% c("numeric", "integer"))   # drop "integer" if integer columns should be excluded

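Then I'd go over `df[, df.num]` and count the negative values in each row, e.g. with `rowSums(df[, df.num] < 0)`. Note that wrapping the comparison in `any()` would collapse it to a single TRUE/FALSE, which `rowSums()` cannot handle. Finally, discard all rows where that count is > 0.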
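A minimal sketch of these steps on a hypothetical toy data frame (the names `toy`, `num.idx`, `neg.count` and `clean` are made up for illustration, and `is.numeric()` is used here as one possible way to pick the numeric and integer columns):

```r
# Hypothetical toy data with numeric, integer, categorical and logical columns
toy <- data.frame(
  a = c(1.5, -2, 3, 4),
  b = c(1L, 2L, -3L, 4L),
  g = c("x", "y", "z", "w"),
  f = c(TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Positions of the numeric columns (is.numeric() covers double and integer)
num.idx <- which(sapply(toy, is.numeric))

# Number of negative values per row; drop = FALSE keeps a data frame
# even if only one numeric column exists
neg.count <- rowSums(toy[, num.idx, drop = FALSE] < 0)

# Keep only the rows without any negative numeric value; all columns are retained
clean <- toy[neg.count == 0, , drop = FALSE]
```

If the numeric columns can contain `NA`, `rowSums(..., na.rm = TRUE)` would ignore the missing values instead of propagating them into the row counts.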

Good luck!

Jasper

You are absolutely right! Splitting the data by type first solves it. This works for me. – λ Allquantor λ Apr 07 '16 at 10:23

λ Allquantor λ · Answer 2 · 2016-04-07T10:51:24.350

Here is how I solved it.

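num.cols <- sapply(df, is.numeric)   # TRUE for numeric (double) and integer columns
numeric  <- df[, num.cols, drop = FALSE]   # numeric columns only
result   <- numeric[!apply(numeric, 1, function(x) any(x < 0)), , drop = FALSE]   # rows without a negative value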
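Note that `result` above contains only the numeric columns. A possible variant that keeps the categorical and logical columns as well is sketched below on hypothetical toy data. The names `toy`, `num.cols` and `has.neg` are illustrative, and treating `NA` as "not negative" via `na.rm = TRUE` is an assumption that may not suit every data set.

```r
# Hypothetical toy data; note the NA in column a
toy <- data.frame(
  a = c(1, -2, NA, 4),
  b = c(1L, 2L, 3L, -4L),
  g = c("x", "y", "z", "w"),
  stringsAsFactors = FALSE
)

# Logical mask of the numeric (double or integer) columns
num.cols <- sapply(toy, is.numeric)

# TRUE for rows with at least one negative numeric value; NAs are ignored
has.neg <- apply(toy[, num.cols, drop = FALSE], 1,
                 function(x) any(x < 0, na.rm = TRUE))

# Index the original data frame so the non-numeric columns survive
result <- toy[!has.neg, , drop = FALSE]
```

Without `na.rm = TRUE`, a row holding an `NA` and no negative value makes `any()` return `NA`, and indexing with that `NA` produces a row consisting entirely of `NA` in the output.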

edited Apr 07 '16 at 10:51

answered Apr 07 '16 at 10:26

λ Allquantor λ
