Conditional if statement in R

Question

I would like to know how I can populate a column in a data.frame or data.table based on the value of other columns.

for example:

data.table(a = c(1:5), b = c(5:1), c = rep("",5))
 a b c
 1 5  
 2 4  
 3 3  
 4 2  
 5 1

I want to populate c to:

indicate "More" if "a" is greater than "b"
"Less" if "a" is less than "b"
"Equal if they are equal

as below:

a b c
1 5 "Less" 
2 4 "Less" 
3 3 "Equal"
4 2 "More"
5 1 "More"

I know that this can be done through a for loop with multiple if statements, but I have a very large data set and I would like to do this using the "Apply" family of functions.

Any help would be greatly appreciated.

Possible duplicate of [Nested ifelse statement in R](http://stackoverflow.com/questions/18012222/nested-ifelse-statement-in-r) — shrgm, May 16 '16 at 05:35
Using `apply` would again invite for nested `ifelse`. Not sure if there is another way to use `apply` . `apply(df, 1, function(x) ifelse(x[1] > x[2], "More", ifelse(x[1] < x[2], "Less", "Equal")))` — Ronak Shah, May 16 '16 at 05:37
The `apply` option mentioned above is not needed as it is a data.table. Also, looping through rows and using `ifelse` is highly inefficient. I already posted some efficient methods on my solution. — akrun, May 16 '16 at 05:40
exactly. Just wanted to clarify why OP insisted on using `apply` — Ronak Shah, May 16 '16 at 05:44

thelatemail · Answer 1 · 2016-05-17T03:21:02.957

Here's one that seems bizarrely efficient by my testing, and is not overly complex:

dt1[, c := c("less","equal","more")[max.col(setDT(.(a < b, a==b, a > b)))] ]

Another option using interaction which is about as good as I can do in simplification of code:

dt1[, c := c("equal","less","more")[interaction(a < b, a > b)] ]

#   a b     c
#1: 1 5  less
#2: 2 4  less
#3: 3 3 equal
#4: 4 2  more
#5: 5 1  more

This works because interaction will return four possibilities for two comparisons:

FALSE.FALSE = neither TRUE, which in this case means not larger or smaller, therefore equal
TRUE.FALSE = first TRUE, therefore less
FALSE.TRUE = second TRUE, therefore more
TRUE.TRUE = both TRUE, theoretically impossible except for fringe cases below the error tolerance.

akrun · Accepted Answer · 2016-05-16T05:47:39.070

3

We can use a numeric method to create unique values based on the logical vectors (a > b and a==b), convert to factor, get the numeric storage values by converting to numeric, replace that with the elements in the vector (c("Less", "More", "Equal")) and assign (:=) it as 'c'.

dt1[, c :=c("Less", "More", "Equal")[as.numeric(factor(1+2*(a>b) + 4*(a==b)))]]
dt1
#   a b     c
#1: 1 5  Less
#2: 2 4  Less
#3: 3 3 Equal
#4: 4 2  More
#5: 5 1  More

Or we can use a nested ifelse to get the expected output.

dt1[, c:= ifelse(a>b, "More", ifelse(a<b, "Less", "Equal"))]

Or another option would be using the Reduce/max.col/pmax to get a numeric index and replace it with the character vector as showed in the first method.

dt1[,  c:= c("Equal", "Less", "More")[pmax(max.col(.SD),
                         3*Reduce(`==`, .SD))], .SDcols = a:b]

data

dt1 <-  data.table(a = c(1:5), b = c(5:1), c = rep("",5))

edited May 16 '16 at 05:47

answered May 16 '16 at 05:25

akrun

874,273
37
540
662

Both methods worked like a charm in the given example. Though I think I prefer the second method to apply on my specific case as I find the first method a bit hard to understand. – Ayelavan May 16 '16 at 05:52
1

can the first option be simplified to `dt1[, c := c("less","more","equal")[ (1 + (a >= b) + (a == b)) ] ]`? – SymbolixAU May 16 '16 at 05:56
@Symbolix Yes, It can be done, I thought by multiplying, we get the unique values. – akrun May 16 '16 at 06:25

Conditional if statement in R

2 Answers2

data