I have a small problem with a data frame in R. This is the head of my data frame:

                         ID              X1 X2 X3     state
1 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     2
2 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     3
3 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     2
4 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     3
5 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     1
6 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     3

I want to add a new column whose value is taken from X1, X2, or X3 depending on the state column: if state is 1, take the value from X1; if state is 2, take it from X2; and if state is 3, take it from X3. The result should look like this:

                         ID              X1 X2 X3     state age
1 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     2     56
2 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     3     57
3 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     2     55
4 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     3     57
5 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     1     69
6 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     3     70

How can I do that? I've tried ifelse and subset, but it doesn't work :(

IRTFM
Kalinkin Alexey
  • 2
    Surely this has been asked before. (It's a very simple indexing task.) Have you done any searching on SO or Google? – IRTFM Mar 26 '15 at 21:18
  • 1
    If your data is called `data`, do `data$age <- data[cbind(seq_len(nrow(data)), data$state + 1L)]` – David Arenburg Mar 26 '15 at 21:19
  • @BondedDust do you have a quick dupe on your mind in order to close this? – David Arenburg Mar 26 '15 at 21:21
  • I gave it a shot. I thought the Questioner should have done it. I'm going to edit the title so it is more descriptive. You should post your comment as an answer. I'm sure there are worked examples of using two-col matrices for indexing matrices but perhaps not for dataframes yet? – IRTFM Mar 26 '15 at 21:41
  • @BondedDust I couldn't find one either. Either way, my comment was already included in the answer below. I also don't see your comments unless there is an `@` before my name. – David Arenburg Mar 26 '15 at 21:47
  • Here's one for matrices: http://stackoverflow.com/questions/6920441/index-values-from-a-matrix-using-row-col-indicies and one for dataframes: http://stackoverflow.com/questions/10894402/choose-one-cell-per-row-in-data-frame . The most productive search strategy seems to be : `[r] two-column matrix index data.frame` – IRTFM Mar 26 '15 at 22:21

2 Answers

1

Assuming 'dat' is the name of the dataframe:

dat$age <- dat[ c("X1","X2", "X3" )][ cbind( seq_len(nrow(dat)), dat$state) ]

Using a two-column matrix as a single argument to "[" is one of the basic R indexing strategies: each row of the matrix gives the (row, column) position of one cell to extract. If the lookup had to go into a table with a different number of rows or columns, possibly with repeated keys, you could use match or findInterval to build either the row or the column vector. Like @DavidArenburg, I find ifelse clunky, especially when the number of options may grow.
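As a minimal sketch of the idea (the data frame below is rebuilt from the question's sample values with the ID column left out, and the `state_codes` labels are a hypothetical illustration, not part of the question), each row of the index matrix is a (row, column) pair, and match can recover the column positions when the states are not already stored as 1, 2, 3:

```r
# Sample values copied from the question (ID column omitted for brevity).
dat <- data.frame(
  X1 = c(56, 56, 55, 55, 69, 69),
  X2 = c(56, 56, 55, 55, 70, 70),
  X3 = c(57, 57, 57, 57, 70, 70),
  state = c(2, 3, 2, 3, 1, 3)
)

# Each row of `idx` is a (row, column) pair; "[" returns one cell per pair.
idx <- cbind(seq_len(nrow(dat)), dat$state)
age <- dat[c("X1", "X2", "X3")][idx]

# Hypothetical variant: states stored as labels rather than as 1..3.
state_codes <- c("young", "middle", "old")   # assumed labels, in column order
state_label <- state_codes[dat$state]
col_idx <- match(state_label, state_codes)   # recover the column positions
age_from_labels <- dat[c("X1", "X2", "X3")][cbind(seq_len(nrow(dat)), col_idx)]
```

Subsetting to the three X columns first means the column index can be the state value itself, with no offset for the ID column.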

IRTFM
  • 258,963
  • 21
  • 364
  • 487
0

ifelse works fine in this case. Try something like:

dat <- read.table(text = "
ID              X1 X2 X3     state
1 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     2
2 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     3
3 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     2
4 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     3
5 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     1
6 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     3")

dat$age <- with(dat, ifelse(state == 1, X1, ifelse(state == 2, X2, X3)))

print(dat)
#                                      ID X1 X2 X3 state age
#1 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     2  56
#2 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     3  57
#3 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     2  55
#4 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     3  57
#5 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     1  69
#6 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     3  70
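
One thing to keep in mind with the nested form: the final else branch catches every value that is not 1 or 2, so an unexpected state silently receives X3. A sketch of a stricter variant follows, using a small hypothetical data frame `d` (the state value 4 is invented here only to show the difference):

```r
# Hypothetical data: the second row has a state outside 1..3.
d <- data.frame(X1 = c(56, 69), X2 = c(56, 70), X3 = c(57, 70), state = c(2, 4))

# Nested ifelse as above: state 4 falls through to X3.
loose <- with(d, ifelse(state == 1, X1, ifelse(state == 2, X2, X3)))

# Stricter variant: test state 3 explicitly and return NA otherwise.
strict <- with(d, ifelse(state == 1, X1,
                  ifelse(state == 2, X2,
                  ifelse(state == 3, X3, NA_real_))))
```

Whether a silent fallback or an NA is preferable depends on how much you trust the state column.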

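EDIT: If you want to use indexing, the following might be an option. Note that matrix-indexing the whole data frame goes through a character matrix because of the ID column, so the result is converted back to numeric here. The `+ 1` offset also assumes that X1 is the second column.

dat$age2 <- as.numeric(dat[cbind(1:nrow(dat), dat$state + 1)])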
print(dat)
#                                      ID X1 X2 X3 state age age2
#1 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     2  56   56
#2 {026560B0-E0BB-4479-832D-2F5EFFAD9E9F} 56 56 57     3  57   57
#3 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     2  55   55
#4 {04E6B096-A3CC-4C82-9E01-69BB3D6A0CEF} 55 55 57     3  57   57
#5 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     1  69   69
#6 {089E7170-E221-46D9-AE2B-3CD7FB1FEE0B} 69 70 70     3  70   70
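
A sketch of why the choice of columns matters, assuming a data frame shaped like the one in the question (rebuilt here with short placeholder IDs instead of the real GUIDs): matrix-indexing the whole data frame goes through as.matrix, which yields a character matrix when a non-numeric column such as ID is present, so the extracted values come back as strings. Looking the columns up by name also avoids depending on the `+ 1` offset.

```r
d <- data.frame(
  ID = c("a", "a", "b"),            # placeholder IDs, not the real GUIDs
  X1 = c(56, 56, 69),
  X2 = c(56, 56, 70),
  X3 = c(57, 57, 70),
  state = c(2, 3, 1),
  stringsAsFactors = FALSE
)

rows <- seq_len(nrow(d))

# Whole data frame: the character ID column forces a character result.
by_offset <- d[cbind(rows, d$state + 1)]
class(by_offset)                     # "character"

# Name-based lookup on the numeric columns only: values stay numeric and
# the result does not depend on where X1 sits in the data frame.
value_cols <- c("X1", "X2", "X3")
col_idx <- match(paste0("X", d$state), value_cols)
by_name <- d[value_cols][cbind(rows, col_idx)]
```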
Anders Ellern Bilgrau
  • Hmm... I wouldn't recommend a double `ifelse` when it can be easily solved by indexing. – David Arenburg Mar 26 '15 at 21:24
  • 1
    @DavidArenburg I see no problem with a single nested `ifelse`. And I find the indexing less expressive --- at least the indexing I can come up with. EDIT: Which I see now is the same as yours. The `+1` does not necessarily work on anything more than the toy dataset. – Anders Ellern Bilgrau Mar 26 '15 at 21:31
  • `ifelse` [has its issues](http://stackoverflow.com/questions/16275149/does-ifelse-really-calculate-both-of-its-vectors-every-time-is-it-slow). A nested one is even worse I guess. – David Arenburg Mar 26 '15 at 21:34
  • Here's a question where the strategy is used for assignment: http://stackoverflow.com/questions/10894402/choose-one-cell-per-row-in-data-frame – IRTFM Mar 26 '15 at 22:20