I have a data frame with more than 2000 features of categorical, numerical and logical types.

The constraint for further processing is that the numerical features must not contain any value < 0. However, such values are present in the data set.

I would like to know how to remove all examples (rows) from the underlying data set where at least one numerical feature is negative.

I already tried it this way: `apply(df, 1, function(x) any(as.numeric(x) < 0))`. However, this converts my categorical features to NA.

David Arenburg
λ Allquantor λ
    Please provide a small reproducible example of data (use `dput`) to enable testing of potential solutions: http://stackoverflow.com/a/5963610/1412059 – Roland Apr 07 '16 at 10:10

2 Answers

First find all numerical columns:

df.classes <- sapply(df, function(x) class(x)[1]) # first class only, since e.g. ordered factors have two
df.num     <- which(df.classes %in% c("numeric", "integer")) # drop "integer" if you do not want to include integer

Then I'd go over `df[, df.num]` and check for negative values, e.g. with `rowSums(df[, df.num] < 0)`, which counts the negative values in each row. Then discard all rows where that count is > 0.
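A minimal sketch of the steps above, using a small made-up data frame (the column names and values are hypothetical, not taken from the question). It assumes the numeric columns contain no missing values; with NAs present, `rowSums` would need `na.rm = TRUE`.

```r
# Hypothetical example data: two numeric columns, one factor, one logical
df <- data.frame(
  a = c(1, -2, 3, 4),
  b = c(5L, 6L, -7L, 8L),
  g = factor(c("x", "y", "x", "z")),
  f = c(TRUE, FALSE, TRUE, FALSE)
)

# Positions of the numeric and integer columns
df.num <- which(sapply(df, is.numeric))

# Count the negative values per row, looking only at the numeric columns
neg.count <- rowSums(df[, df.num, drop = FALSE] < 0)

# Keep the rows without any negative value; all column types are retained
df.clean <- df[neg.count == 0, ]
```

Here `drop = FALSE` keeps the subset a data frame even if only one numeric column exists, so `rowSums` still receives a two-dimensional object.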

Good luck!

Jasper

Here is how I solved it.

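# Select the numeric and integer columns; drop = FALSE keeps a data frame even for a single column
numeric <- df[, sapply(df, class) %in% c('numeric', 'integer'), drop = FALSE]
# Keep only the rows in which no value is negative
result <- numeric[!apply(numeric, 1, function(x) any(x < 0)), ]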
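Note that `result` above contains only the numeric columns. If the categorical and logical columns should be retained as well, one possible variant is to compute the row filter on the numeric columns and then index the full data frame. The sketch below uses made-up data and hypothetical names (`is.num`, `has.neg`), and it assumes that missing values should not count as negative.

```r
# Hypothetical example data with mixed column types and one missing value
df <- data.frame(
  a = c(1, -2, NA, 4),
  b = c(5L, 6L, 7L, -8L),
  g = c("x", "y", "x", "z"),
  stringsAsFactors = TRUE
)

# Logical mask of the numeric and integer columns
is.num <- sapply(df, is.numeric)

# TRUE for each row that has at least one negative numeric value;
# na.rm = TRUE makes any() ignore missing values instead of returning NA
has.neg <- apply(df[, is.num, drop = FALSE], 1,
                 function(x) any(x < 0, na.rm = TRUE))

# Index the full data frame so that the non-numeric columns are kept
result <- df[!has.neg, ]
```

Restricting `apply` to the numeric columns also avoids the problem from the question: applied to the whole data frame, `apply` first coerces everything to a character matrix, which is why `as.numeric` produced NA for the categorical values.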
λ Allquantor λ