
Let's say I write a function

group <- function(x) {
  if (x <= 8) o <- 1
  else if (x <= 11) o <- 2
  else o <- 3
  return(o)
}

and have a matrix

 test=matrix(1:25,nrow=5); test
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25

Now I want to add 3 columns (columns 6, 7 and 8) to the matrix. Columns 6 and 7 should hold the value of the function group applied to columns 2 and 3, and column 8 should mark the change of group between them. That is to say, I want to get:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    6   11   16   21    1    2   12 # 6 <= 8 so column 6 is 1; 11 <= 11 so column 7 is 2; the group changes from 1 to 2, so column 8 is 12
[2,]    2    7   12   17   22    1    3   13
[3,]    3    8   13   18   23    1    3   13
[4,]    4    9   14   19   24    2    3   23
[5,]    5   10   15   20   25    2    3   23

I tried to use

test2=cbind(test,group(test[,2:3]))

but it didn't work and gave the following warning:

Warning messages:
1: In if (x == 0) { :
  the condition has length > 1 and only the first element will be used

I also tried the following, which didn't work either:

test2=cbind(test, apply(test[,2:3],1,group))

So which function should I use? Also, for column 8, do we have such a function? Thank you!

P.S. I was trying to compute a Markov transition matrix using R, not sure if my approach is the most compact way...

Natalia
  • you should use `cut` instead `tt <- test[, 2:3]; tt[] <- as.numeric(cut(tt, breaks = c(0, 8, 11, Inf))); cbind(tt, as.numeric(apply(tt, 1, paste0, collapse = '')))` – rawr Dec 02 '15 at 04:00
  • @rawr - or all in one crazy line ``cbind(test, `[<-`(test[,2:3], cut(test[,2:3], c(-Inf,8,11,Inf), labels=1:3)) )`` – thelatemail Dec 02 '15 at 04:01
  • if there was ever a crazy one, it would be @thelatemail – rawr Dec 02 '15 at 04:02
  • Thank you @rawr . I didn't know `cut` before! ;) – Natalia Dec 02 '15 at 04:15
  • @thelatemail , what does `Inf` mean? And what is ` `[<-` ` ? It seems so compact that I can't understand... What if I want more intervals (more cuts)? Thank you. – Natalia Dec 02 '15 at 04:17
  • @Natalia - `Inf` is "Infinity" - see `?Inf` - it just means anything below 8 to negative infinity goes into the first group, and anything above 11 to infinity goes into the last group. The `` `[<-` `` is too much to explain in detail here, but it's a replacement function, like doing `test[,2:3] <- ...` but the result is returned - see http://stackoverflow.com/questions/11563154/what-are-replacement-functions-in-r – thelatemail Dec 02 '15 at 04:20
  • @rawr , `tt[] <- as.numeric(cut(tt, breaks = c(0, 8, 11, Inf)))` it returned one `NA` for tt[1,1]. Any idea why? – Natalia Dec 02 '15 at 04:28
  • @thelatemail , when I tried to use your code in my real data set (imported from csv), It said `Error in `[<-.data.frame`(data[, 2:3], cut(data[, 2:3], c(-Inf, 0, 5, : need 0, 1, or 2 subscripts`. What's wrong? – Natalia Dec 02 '15 at 04:49
  • @Natalia - you're using a `data.frame` not a `matrix` - the `cut` code fails because of this, which in turn causes the `\`[<-\`` code to fail. – thelatemail Dec 02 '15 at 04:55
  • @thelatemail got you. so I can only use `cut` in a matrix. – Natalia Dec 02 '15 at 04:59
  • @Natalia you can use it on a data frame but in a different way. you should think of data frames as an object comprised of smaller objects (each column is its own list) so therefore you need to act on each object. a matrix is basically a single vector with two dimensions instead of one (like a vector). if you're working with a data frame, something like `tt[, 2:3] <- lapply(tt[, 2:3], function(x) as.numeric(cut(x, c(0, 8, ...))))` could work – rawr Dec 02 '15 at 13:14
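
The `cut` suggestions in the comments above can be sketched for both a matrix and a data frame. This is only an illustration, not the commenters' exact code: the names `breaks` and `df` are introduced here, and the breaks start at `-Inf` instead of 0. With `breaks = c(0, 8, 11, Inf)` a value of exactly 0 lies outside the first interval `(0, 8]` and comes back as `NA`, which is one possible cause of an unexpected `NA`.

```r
test <- matrix(1:25, nrow = 5)

# Intervals are (-Inf, 8], (8, 11], (11, Inf]; add more values for more groups.
breaks <- c(-Inf, 8, 11, Inf)

# Matrix: cut() sees one long numeric vector; tt[] keeps the 5 x 2 shape.
tt <- test[, 2:3]
tt[] <- as.integer(cut(tt, breaks = breaks))

# Data frame: each column is a separate vector, so cut() is applied per column.
df <- as.data.frame(test)
df[, 2:3] <- lapply(df[, 2:3], function(x) as.integer(cut(x, breaks = breaks)))

# A value sitting exactly on the lowest break is not in any interval.
zero_group <- cut(0, breaks = c(0, 8, 11, Inf))
```

Passing `include.lowest = TRUE` to `cut` would place a value equal to the lowest break in the first group.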

1 Answer

As suggested by @Christoph, you can use a nested ifelse. Unlike if, ifelse is vectorized, so it handles every element of test[, 2:3] at once:

cbind(test, ifelse(test[, 2:3] <= 8, 1, ifelse(test[, 2:3] <= 11, 2, 3)))

#      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,]    1    6   11   16   21    1    2
#[2,]    2    7   12   17   22    1    3
#[3,]    3    8   13   18   23    1    3
#[4,]    4    9   14   19   24    2    3
#[5,]    5   10   15   20   25    2    3
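
If you would rather keep the original scalar `group` function, one option is to call it once per element. The attempts in the question fail because `group(test[, 2:3])` hands the whole sub-matrix to `if`, and `apply(test[, 2:3], 1, group)` hands it one row of length 2 at a time; `if` needs a single TRUE or FALSE (from R 4.2.0 on, a longer condition is an error rather than a warning). A minimal sketch, where `g` and `test2` are names chosen for illustration:

```r
test <- matrix(1:25, nrow = 5)

# Scalar function from the question: expects a single number.
group <- function(x) {
  if (x <= 8) 1 else if (x <= 11) 2 else 3
}

# MARGIN = c(1, 2) calls group() on each cell and keeps the 5 x 2 shape.
g <- apply(test[, 2:3], c(1, 2), group)

test2 <- cbind(test, g)
```

This is slower than `ifelse` or `cut` on large data, since the function is called once per cell.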

EDIT

As per the edited question, the new column can be built by pasting the two group columns together and converting the result back to a number:

mat <- cbind(test, ifelse(test[, 2:3] <= 8, 1, ifelse(test[, 2:3] <= 11, 2, 3)))
cbind(mat, as.numeric(paste(mat[, 6], mat[, 7], sep = "")))

#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#[1,]    1    6   11   16   21    1    2   12
#[2,]    2    7   12   17   22    1    3   13
#[3,]    3    8   13   18   23    1    3   13
#[4,]    4    9   14   19   24    2    3   23
#[5,]    5   10   15   20   25    2    3   23
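
Since the question mentions a Markov transition matrix, here is a hedged sketch of one way to go from the two group columns to transition counts without the combined code column. It assumes single-digit group labels 1 to 3 and treats column 2 as the "from" state and column 3 as the "to" state; the names `g`, `code`, `counts` and `probs` are introduced here for illustration.

```r
test <- matrix(1:25, nrow = 5)
g <- ifelse(test[, 2:3] <= 8, 1, ifelse(test[, 2:3] <= 11, 2, 3))

# Arithmetic alternative to paste(): only valid for single-digit groups.
code <- 10 * g[, 1] + g[, 2]

# Cross-tabulate from/to; fixing the factor levels keeps empty states in the table.
counts <- table(from = factor(g[, 1], levels = 1:3),
                to   = factor(g[, 2], levels = 1:3))

# Row-normalise to get estimated transition probabilities.
probs <- prop.table(counts, margin = 1)
```

Rows of `counts` with no observations (state 3 here) give `NaN` in `probs`, so they need separate handling.
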
Ronak Shah