I am trying to subset rows in R based on the values in column y, as shown below.

I have a data frame called "data" that looks like this:

data <- data.frame(y = c(0,0,2000,1500,20,77,88),
                   a = "bla", b = "bla")

And would end up with this:

I have this R code:

library(dplyr)  # arrange() and desc() come from dplyr
data <- arrange(subset(data, y != 0 & y < 1000 & y != 77 & [...]), desc(y))
print(head(data, n = 100))

Which works.

However, I would like to collect the values to exclude in a list, such as:

[0, 1000, 77]

I would then like to loop through this list somehow, with the lowest possible running time, instead of hardcoding the values directly in the formula. Any ideas?

The list should only contain the values used in "!=" comparisons:

[0, 77]

and the "<" comparison should remain in the formula or go in another list.

Cody Gray - on strike
User123456789
  • 1
    Please provide an [MRE](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), and in particular include some example data by pasting the output of `dput`. Usually, vectorised approaches to manipulating your data, such as `dplyr` or `data.table`, are faster than `for` loops. If you want a certain print output, you could write your own `print` function – starja Dec 23 '20 at 09:55
  • 1
    How do you know which operators to apply to your vector of numbers? For example if the values are `c(0,1000,77)` and the operators are `c("!=","<","!=")`, where are you keeping track of the operators? – Ian Campbell Dec 23 '20 at 14:25
  • Thank you for your comment, Ian. Actually there is only one operation with "<" and a lot with "!=", so I will keep the "<" in the equation and not include 1000 in the list. But I still need a neat way of handling all the "!=" operations. – User123456789 Dec 23 '20 at 14:36

1 Answer

I'm going to answer your original question because it's more interesting. I hope you won't mind.

Imagine you had values and operators to apply to your data:

my.operators <- c("!=","<","!=")
my.values <- c(0,1000,77)

You can use Map from base R to apply a function element-wise over two vectors. Here I'll use get to obtain the actual operator function named by each character string.

Map(function(x,y)get(y)(data$y,x),my.values,my.operators)
[[1]]
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

[[2]]
[1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE

[[3]]
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE

As you can see, we get a list with one logical vector per (value, operator) pair.

To better understand what's going on here, consider only the first value of each vector:

get("!=")(data$y,0)
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
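As a small sketch of the same lookup, base R's match.fun can stand in for get. It also resolves a function from its name, but it raises an error if the name does not refer to a function. The vector `y` below is just a stand-in for data$y from the question, and `not_equal` and `less_than` are illustrative names, not part of the original answer.

```r
# Stand-in for data$y from the question
y <- c(0, 0, 2000, 1500, 20, 77, 88)

# match.fun() resolves an operator from its name, much like get(),
# but it errors if the name does not refer to a function
not_equal <- match.fun("!=")
less_than <- match.fun("<")

not_equal(y, 0)     # same call as y != 0
less_than(y, 1000)  # same call as y < 1000
```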

Now we can use Reduce to combine a list of logical vectors into one with `&`. For example, if every comparison were "!=":

Reduce(`&`,lapply(my.values,function(x) data$y!=x))
[1] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE
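To make the folding step concrete, here is a minimal sketch of what Reduce does with a list of logical vectors. `y` is again a stand-in for data$y, and `masks`, `combined` and `by_hand` are illustrative names.

```r
# Stand-in for data$y from the question
y <- c(0, 0, 2000, 1500, 20, 77, 88)

# One logical vector per condition
masks <- list(y != 0, y < 1000, y != 77)

# Reduce() applies `&` pairwise from left to right
combined <- Reduce(`&`, masks)

# The same result written out by hand
by_hand <- masks[[1]] & masks[[2]] & masks[[3]]

identical(combined, by_hand)  # TRUE
```

A row is kept only if every condition is TRUE for it, which is why `&` is the right function to fold with.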

Finally, apply Reduce to the list produced by Map and use the result to subset the data:

data[Reduce("&",Map(function(x,y)get(y)(data$y,x),my.values,my.operators)),]
   y   a   b
5 20 bla bla
7 88 bla bla
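For the narrower case described in the comments, where the list holds only "!=" values and the "<" stays in the formula, a vectorised `%in%` test may be all that is needed. This is a sketch under that assumption. `exclude` and `result` are hypothetical names, and base R's order is used in place of dplyr's arrange to keep the example self-contained.

```r
data <- data.frame(y = c(0, 0, 2000, 1500, 20, 77, 88),
                   a = "bla", b = "bla")

# Values to drop; extend this vector instead of adding more `!=` terms
exclude <- c(0, 77)

# Keep rows whose y is not in `exclude` and is below 1000
result <- data[!(data$y %in% exclude) & data$y < 1000, ]

# Sort by y in descending order
result <- result[order(-result$y), ]
result
```

Because `%in%` is vectorised, no explicit loop over the excluded values is required.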
Ian Campbell