3

Let's say I have a table named table, with headers, which looks like

A B
1 2
2 1

What I want to do is add a new column table$C such that

  1. if table$A < table$B then table$C <- (table$B-table$A)/table$A
  2. if table$A > table$B then table$C <- (table$B-table$A)/table$B

for each row so that the resulting table would look like

A B C
1 2 1
2 1 -1

I tried, quite naively,
> table$C <- if (table$A < table$B) (table$B-table$A)/table$A else (table$B-table$A)/table$B
and
> table$C <- ifelse(table$A < table$B, (table$B-table$A)/table$A, (table$B-table$A)/table$B)
but neither of them worked. How do I do this?

Brian Tompsett - 汤莱恩
HBS
  • Do you really have a backslash in front of the last `table$B`? – seancarmody Aug 31 '12 at 23:35
  • I removed the backslash. It's just a bad habit from using too much LaTeX. :-) – HBS Aug 31 '12 at 23:38
  • The first call assigns C twice - probably a bad idea. The second solution looks OK to me. – Alex Brown Aug 31 '12 at 23:43
  • Well, the second one was the correct one, as Alex said. The problem turned out to be just a simple typo. :-( RSK's answer is also quite helpful in making me think 'out of the box', since I was obsessed with the if/else statement. Thanks guys! – HBS Aug 31 '12 at 23:52
  • I thought I saw a `with` answer before...that's how I would have done it. Was it deleted? – seancarmody Sep 01 '12 at 01:53

3 Answers

4

As usual, there are many ways to accomplish this. Assuming your rule stays the same and you never divide by zero, here are a few ideas:

df <- data.frame(A = c(1, 2), B = c(2, 1))

# Making use of pmin (element-wise minimum).
df$C <- (df$B - df$A) / pmin(df$A, df$B)

# Making use of 'with' (see ?with).
df$C <- with(df, (B - A) / pmin(A, B))

# Making use of data.table.
library(data.table, quietly = TRUE)

## data.table 1.8.2 For help type: help("data.table")

dt <- data.table(df)

dt[, `:=`(C, (B - A)/pmin(A, B))]

##    A B  C
## 1: 1 2  1
## 2: 2 1 -1
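
As a quick sanity check, the sketch below compares the pmin form with the rule as stated in the question. The extra rows in the sample data are made up for illustration and are not from the question, and the names by_rule and by_pmin are hypothetical.

```r
# Hypothetical sample data: the question's two rows plus two invented rows,
# including one where A equals B.
df <- data.frame(A = c(1, 2, 4, 5), B = c(2, 1, 8, 5))

# The rule as written in the question: divide by A when A < B, else by B.
by_rule <- with(df, ifelse(A < B, (B - A) / A, (B - A) / B))

# pmin() takes the element-wise minimum, so it picks the same denominator.
by_pmin <- with(df, (B - A) / pmin(A, B))

print(by_rule)
print(all.equal(by_rule, by_pmin))
```

When A equals B the numerator is zero, so it does not matter which of the two columns ends up in the denominator, provided neither is zero.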
Thell
  • +1 I've noticed people using functional `:=`() recently and wondered why, though. Isn't `dt[,C:=(B-A)/min(A,b)]` slightly easier to read? – Matt Dowle Sep 04 '12 at 18:53
  • And do the `min`s need to be `pmin`s? – Matt Dowle Sep 04 '12 at 18:58
  • @MatthewDowle I'm trying to train my mind to use the functional `:=`() format for usage with multiple variables. :) And, yes, min should be pmin thank you! – Thell Sep 04 '12 at 22:59
  • But `':='()` isn't needed for multiple variables is it? `DT[,c("new1","new2"):=1,with=FALSE]` adds two new columns, and `DT[,LETTERS:=2,with=FALSE]` adds 26 (unless any `LETTERS` already exist, in which case those columns are updated). Btw, there's a FR to drop needing `with=FALSE` in the first case (LHS is a `call`), but `with=FALSE` will always be needed in 2nd case (to say the column shouldn't be called `"LETTERS"`). – Matt Dowle Sep 04 '12 at 23:47
  • Well, how do you like that. Don't know where I got the idea that I couldn't do `dt[,c('a','b'):=list(rev(A), rev(B)), with=FALSE]` but it sure enough works. Woot! – Thell Sep 05 '12 at 00:21
2

Your second approach (without the original typo) is correct, as you note, but I think this is quicker, easier to read and less error-prone:

table$C <- with(table, ifelse(A < B, (B - A)/A, (B - A)/B))
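
To illustrate why the first attempt in the question fails while ifelse works, here is a minimal sketch on the question's data. It assumes, as far as I know, that older R versions only warn when `if` receives a condition of length greater than one, while newer ones (4.2.0 onward) stop with an error. The tryCatch wrapper and the name scalar_if are my own additions so the example runs either way.

```r
table <- data.frame(A = c(1, 2), B = c(2, 1))

# `if` expects a single TRUE/FALSE. Given a length-2 condition it either
# warns (older R) or fails (newer R); record which one happened.
scalar_if <- tryCatch(
  if (table$A < table$B) "first branch" else "second branch",
  warning = function(w) "warning",
  error   = function(e) "error"
)
print(scalar_if)

# ifelse() evaluates the condition element by element, one result per row.
table$C <- with(table, ifelse(A < B, (B - A) / A, (B - A) / B))
print(table)
```

Note that ifelse computes both candidate vectors in full and then selects from them row by row, so a zero in either column can still produce Inf or NaN in the unselected branch without affecting the result.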
seancarmody
  • `within` might be even better. – Matt Dowle Sep 04 '12 at 19:06
  • Do you mean something like this: `table <- within(table, C <- ifelse(A < B, (B - A)/A, (B - A)/B))`? By my reckoning, it's marginally longer in this instance. Not sure about efficiency. – seancarmody Sep 05 '12 at 09:57
  • Oh, good point. `within` doesn't help much actually does it. On efficiency, `within` and `$C <-` copy the entire object, at least once, so I've moved over to `:=` for all tasks like this. – Matt Dowle Sep 05 '12 at 11:21
  • Yes. I'm not suggesting data.table, btw. I was just explaining why my suggestion of `within` turned out to be poor (i.e. because I'm not as familiar with it). I was just trying to suggest something that _wasn't_ data.table for a change! – Matt Dowle Sep 05 '12 at 14:45
  • I wouldn't worry about that: there seem to be plenty of data.table aficionados around here! – seancarmody Sep 06 '12 at 10:33
0

Use logical vectors as 0/1 masks to pick the denominator:

table$C <- (table$B-table$A)/((table$A<=table$B)*table$A+(table$A>table$B)*table$B)
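
The denominator in that expression can be looked at on its own. The sketch below uses made-up vectors A and B (not the question's table) to show that the two comparisons act as 0/1 masks, so each row keeps exactly one of the two terms.

```r
# Hypothetical vectors, including a tie in the last position.
A <- c(1, 2, 3)
B <- c(2, 1, 3)

# TRUE/FALSE are coerced to 1/0 in arithmetic, so for every element exactly
# one of the two products is non-zero.
denom <- (A <= B) * A + (A > B) * B
print(denom)

# For finite values this is the element-wise minimum.
print(all.equal(denom, pmin(A, B)))
```

One caveat with the mask form: since 0 * Inf is NaN in R, an infinite value in either column would turn the denominator into NaN, whereas pmin would not.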
Superbest
rezakhorshidi
  • This repeats the variable `table` 9 times in one line. See [here](http://stackoverflow.com/a/10758086/403310) for how variable name repetition can bite. – Matt Dowle Sep 04 '12 at 19:05