7

For example, I have a matrix k

> k
  d e
a 1 3
b 2 4

I want to apply a function to each row of k

> apply(k,MARGIN=1,function(p) {p+1})
  a b
d 2 3
e 4 5

However, I also want to print the row name of the row being processed, so that I know which row the function is being applied to at that moment.

It might look like this:

apply(k,MARGIN=1,function(p) {print(rowname(p)); p+1})

But I really don't know how to do that in R. Does anyone have any idea?

Hanfei Sun
  • Can you clarify what answer you're expecting? If you add 1 to every number in the 1st instance of `k`, you don't get the answer in your 2nd instance of `k`. – ChrisW Jun 08 '12 at 22:58
  • There are lots of really messy suggestions here - can you let me know if my proposed solution does what you're seeking? – Tim P Jun 09 '12 at 06:51

4 Answers

14

Here's a neat solution to what I think you're asking. (I've called the input matrix mat rather than k for clarity - in this example, mat has 2 columns and 10 rows, and the rows are named abc1 through to abc10.)

In the code below, the result out1 is the thing you wanted to calculate (the outcome of the apply command). The result out2 comes out identically to out1 except that it prints out the rownames that it is working on (I put in a delay of 0.3 seconds per row so you can see it really does do this - take this out when you want the code to run full speed obviously!)

The trick I came up with was to cbind the row numbers (1 to n) onto the left of mat (to create a matrix with one additional column), and then use this to refer back to the rownames of mat. Note the line x = y[-1] which means that the actual calculation within the function (here, adding 1) ignores the first column of row numbers, which means it's the same as the calculation done for out1. Whatever sort of calculation you want to perform on the rows can be done this way - just pretend that y never existed, and formulate your desired calculation using x. Hope this helps.

set.seed(1234)
mat = as.matrix(data.frame(x = rpois(10,4), y = rpois(10,4)))
rownames(mat) = paste("abc", 1:nrow(mat), sep="")
out1 = apply(mat,1,function(x) {x+1})
out2 = apply(cbind(seq_len(nrow(mat)),mat),1,
             function(y) {
                           x = y[-1]
                           cat("Doing row:",rownames(mat)[y[1]],"\n")
                           Sys.sleep(0.3)
                           x+1
                          }
            )

identical(out1,out2)
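
To see what the index column is doing, here is a minimal sketch of the same cbind idea on the small matrix k from the question. The name aug is just an illustrative choice. It assumes a numeric matrix: cbind coerces everything to one type, so the index column would not stay numeric next to character data.

```r
# Toy matrix from the question (rows a, b; columns d, e).
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Prepend the row index as an extra first column.
aug <- cbind(seq_len(nrow(k)), k)

out <- apply(aug, 1, function(y) {
  cat("Doing row:", rownames(k)[y[1]], "\n")  # y[1] is the row index
  y[-1] + 1                                   # the real work ignores the index
})

# Compare with the plain call that does no printing.
all.equal(out, apply(k, 1, function(p) p + 1))
```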
Tim P
4

You can use a variable outside of the apply call to keep track of the row index and pass the row names as an extra argument to your function:

idx <- 1
apply(k, 1, function(p, rn) {print(rn[idx]); idx <<- idx + 1; p + 1}, rownames(k))
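
If the global counter is a concern, one possible variation is to wrap the call in a function so that <<- updates a variable local to that function instead. This is only a sketch; apply_verbose is a hypothetical helper name, not something from base R.

```r
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# apply_verbose is a hypothetical helper, not part of base R.
apply_verbose <- function(m, f) {
  idx <- 1                      # counter local to this call
  apply(m, 1, function(p) {
    cat("Doing row:", rownames(m)[idx], "\n")
    idx <<- idx + 1             # updates idx in apply_verbose's frame
    f(p)
  })
}

res <- apply_verbose(k, function(p) p + 1)
```

Because idx is created afresh on every call to apply_verbose, repeated calls start counting from the first row again.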
ALiX
  • It works! But it still needs to use a global variable as a counter.. not very elegant.. – Hanfei Sun Jun 09 '12 at 00:40
  • It's easy if you're allowed to use global variables, but my solution above doesn't require these (assuming I've understood the task correctly) :) – Tim P Jun 09 '12 at 06:54
  • This is no different than using a looping variable in a for loop, really. It is no less elegant than that. – ALiX Jun 10 '12 at 01:46
  • Because my input was a subset of a larger dataset, `rownames` did not produce sequential integers. Therefore, I replaced `rownames(k)` with `1:nrow(k)` – Serenthia Apr 14 '15 at 13:53
0

This should work. The cat() function is what you want to use when printing results during evaluation of a function. paste(), by contrast, just returns a character vector and doesn't send anything to the console.
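
A small sketch of that difference, using capture.output() to record what each call actually writes. The two helper functions are made up for the comparison.

```r
# Hypothetical helpers: one builds a string with paste(), one prints with cat().
with_paste <- function() { paste("Doing row:", "a"); invisible(NULL) }
with_cat   <- function() { cat("Doing row:", "a", "\n"); invisible(NULL) }

out_paste <- capture.output(with_paste())  # paste() alone writes nothing
out_cat   <- capture.output(with_cat())    # cat() writes one line

length(out_paste)
length(out_cat)
```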

The solution below uses a counter created as a closure, allowing it to "remember" how many times the function has been run before. Note the use of the superassignment operator <<-, which modifies i in the enclosing function's environment rather than creating a new local variable. If you really want to understand what's going on here, I recommend reading through this wiki https://github.com/hadley/devtools/wiki/

Note there may be an easier way to do this; my solution assumes that there is no way to access the rownumber or rowname of a current row using typical means within an apply function. As previously mentioned, this would be no problem in a loop.

k <- matrix(c(1,2,3,4),ncol=2)
rownames(k) <- c("a","b")
colnames(k) <- c("d","e")


make.counter <- function(){
    i <- 0              # state kept in the enclosing environment
    function(){
        i <<- i+1       # update the enclosing i, not a new local one
        i
    }
}

counter1 <- make.counter()

apply(k,MARGIN=1,function(p){
    current.row <- rownames(k)[counter1()]
    cat(current.row,"\n")
    return(p+1)
})
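
One thing to watch with this pattern: the counter keeps its state between calls, so running the apply a second time with the same counter1 would index past the last row and print NA. Creating a fresh counter before each run avoids that. A minimal sketch of how separate counters behave (make_counter here is a stand-alone copy for illustration):

```r
# Stand-alone copy of the counter factory, for illustration only.
make_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1   # modifies the i captured by this particular closure
    i
  }
}

c1 <- make_counter()
c2 <- make_counter()

first  <- c1()   # first call on c1
second <- c1()   # c1 remembers its previous call
other  <- c2()   # c2 has its own, separate i
c(first, second, other)
```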
Michael
-2

As far as I know you cannot do that with apply, but you could loop through the rownames of your data frame. Lame example:

lapply(rownames(mtcars), function(x) sprintf('The mpg of %s is %s.', x, mtcars[x, 1]))
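
Applied to the matrix k from the question, the same idea might look like the sketch below. It swaps lapply for sapply so the result comes back as a matrix, and calls flush.console() so the message is pushed out as each row is processed on consoles that buffer output.

```r
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Iterate over the row names, so each call knows which row it has.
res <- sapply(rownames(k), function(rn) {
  cat("Doing row:", rn, "\n")
  flush.console()   # show the message now rather than at the end
  k[rn, ] + 1
})

# Compare with the plain apply() call.
all.equal(res, apply(k, 1, function(p) p + 1))
```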
daroczig
  • You can do it with `apply` - and not only is looping a really bad idea in general, but the above example doesn't actually print them one by one (put in a Sys.sleep() delay and you'll see they all come out in one burst at the end)... – Tim P Jun 09 '12 at 07:00
  • @TimP - `flush.console()` is handy for things like this – Chase Jun 09 '12 at 20:44
  • As @TimP demonstrates in his answer, this is possible with `apply()`. This accepted answer is misleading. – Zhubarb Nov 06 '13 at 13:52
  • @Berkan - of course there are work-arounds; it would be, e.g., a lot simpler to add the `rownames` as a new variable to the `data.frame` and `apply` a function on top of that instead of adding some mysterious index. What I've written above: `apply` would ditch all `attributes` of rows/columns, and for such tasks (without extending the existing `data.frame`) this answer *might* be the quickest (not in terms of CPU performance) solution. – daroczig Nov 06 '13 at 23:53