17

After quite a bit of debugging today, to my dismay I found that:

for (i in 1:0) {
     print(i)
}

actually prints 1 and then 0 in R. The problem came up when writing

for (i in 1:nrow(myframe)) {
     fn(i)
}

which I had intended to not execute at all if nrow(myframe) == 0. Is the proper correction just:

if (nrow(myframe) != 0) {
    for (i in 1:nrow(myframe)) {
        fn(i)
    }
}

Or is there a more proper way to do what I wanted in R?

mt88

5 Answers

26

You can use seq_along instead:

vec <- numeric() 
length(vec)
#[1] 0

for(i in seq_along(vec)) print(i)   # doesn't print anything

vec <- 1:5

for(i in seq_along(vec)) print(i)
#[1] 1
#[1] 2
#[1] 3
#[1] 4
#[1] 5

Edit after OP update

df <- data.frame(a = numeric(), b = numeric())
df
#[1] a b
#<0 rows> (or row.names with length 0)

for(i in seq_len(nrow(df))) print(i)    # doesn't print anything

df <- data.frame(a = 1:3, b = 5:7)

for(i in seq_len(nrow(df))) print(i)
#[1] 1
#[1] 2
#[1] 3
talat
4

Regarding the edit, see the counterpart function seq_len(), as in seq_len(NROW(myframe)). This case is exactly why you shouldn't use 1:N in a for() loop: if whatever value ends up replacing N is 0 or negative, 1:N counts down instead of producing an empty sequence.
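A small sketch of the difference, using only base R. The variable `n` is a hypothetical stand-in for whatever count the loop depends on:

```r
n <- 0  # hypothetical count, e.g. nrow() of an empty data frame

# The colon operator counts down when the right-hand side is smaller
colon_idx <- 1:n

# seq_len() returns an empty integer vector for 0
safe_idx <- seq_len(n)

length(colon_idx)  # 2
length(safe_idx)   # 0

# The loop body never runs for the empty sequence
iterations <- 0
for (i in safe_idx) iterations <- iterations + 1
```

Note that seq_len() signals an error for a negative argument rather than returning an empty sequence, so a negative count still needs its own check.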

An alternative (which just hides the loop) is apply(myframe, 1, FUN = foo), where foo is a function containing whatever you want to do to each row of myframe; its body can probably just be cut and pasted from the body of the loop.
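As a rough sketch of that idea, assuming a small all-numeric data frame and a hypothetical `foo` that sums each row (neither comes from the question):

```r
# Hypothetical data and row function, for illustration only
myframe <- data.frame(a = 1:3, b = 4:6)
foo <- function(row) sum(row)

# apply() converts the data frame to a matrix and passes each row to foo
row_sums <- apply(myframe, 1, FUN = foo)
print(row_sums)
```

Because apply() converts the data frame to a matrix first, columns of mixed types are coerced to a common type, so this fits best when all columns share a type.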

Gavin Simpson
4

For vectors there is seq_along; for data frames you can use seq_len on the row count:

for(i in seq_len(nrow(the.table))){
    do.stuff()
}
Boris Gorelik
3

Clearly all previous answers do the job.

I like to have something like this:

rows_along <- function(df) seq_len(nrow(df))  # seq(nrow(df)) would give 1 0 for zero rows

and then

for(i in rows_along(df)) # do stuff

Totally idiosyncratic answer, it is just a wrapper. But I think it is more readable/intuitive.

1

I think the most proper way in R is to use an apply function. More often than not, there's an apply function that does that. And more often than not, you don't need a sequence.

Here's an example that applies diff to each column, or each row.

> d <- data.frame(x = 1:5, y = 6:10)

over the columns,

> lapply(d, diff)
$x
[1] 1 1 1 1

$y
[1] 1 1 1 1

across the rows,

> apply(d, 1, diff)
[1] 5 5 5 5 5

over the columns again, returning a matrix

> sapply(d, diff)
     x y
[1,] 1 1
[2,] 1 1
[3,] 1 1
[4,] 1 1

See this link for a most excellent explanation about apply

Community
Rich Scriven
  • "I think the most proper way in R is to use an apply function" -- with all due respect, I don't think this is good advice. It's OK for there to be two or more ways to do something, but the "wrong way" can't be the most obvious and usually workable way, with the "right way" trailing somewhere behind it; that's just messing with people's heads. For what it's worth. – Robert Dodier Apr 30 '20 at 21:43