create month column by summing over another column in data.frame

Question

In R, I am trying to create a month column to plot my data with. The column should count up over another column (`ORIG_ROW`) that has the same value for every row of a given population. For example, this is the result I want:

NAME ORIG_ROW MONTH
POP1 1        1
POP1 1        2
POP1 1        3
POP2 2        1
POP2 2        2
POP2 2        3

I am able to do this with:

df$MONTH <- sapply(1:nrow(df), function(i) (colSums(df[0:i, c('ORIG_ROW') == df$ORIG_ROW[i]))

However, this code is inefficient when I try to apply it to a large dataset (~825k observations).

Does anyone have suggestions on how to make this code more efficient?

Take a look at [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). It's going to be hard for people to answer this question without a minimal example of your starting data and expected results. — divibisan, Aug 06 '18 at 17:27
R is one-based, `0:i` becomes `1:i`. Also, do you want `MONTH` to have consecutive values, `1:n ` where `n` is the number of rows of each group of `ORIG_ROW`? — Rui Barradas, Aug 06 '18 at 17:33
@RuiBarradas yep, I noticed that it worked the same with 0:i and 1:i, I will adjust in my code since R is one based. And yes, I would like 'MONTH' to have consecutive values as you say. — K. Jean, Aug 06 '18 at 17:44
Right now your "able to do with" code doesn't run - you're missing a `]` and a `)`. — DanY, Aug 06 '18 at 17:47

score 1 · Accepted Answer · answered Aug 06 '18 at 17:56

What you want can be done with a single call to `ave`, using the column as its own grouping variable, so that `seq_along` numbers the rows within each group of `ORIG_ROW`.

df$MONTH <- with(df, ave(ORIG_ROW, ORIG_ROW, FUN = seq_along))
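To make the grouping step more concrete, here is a minimal sketch of roughly what the `ave` call does under the hood: split the values by group, number the elements of each piece, and put the pieces back in the original order. The vector `orig_row` and the intermediate names are hypothetical stand-ins for `df$ORIG_ROW`, not part of the original answer.

```r
# Hypothetical toy vector standing in for df$ORIG_ROW
orig_row <- c(1L, 1L, 1L, 2L, 2L, 2L)

# Split the values by group, number the elements within each group,
# then reassemble the pieces in the original row order.
pieces <- split(orig_row, orig_row)
numbered <- lapply(pieces, seq_along)
month_manual <- unsplit(numbered, orig_row)

# The one-line ave() call from the answer, applied to the same vector
month_ave <- ave(orig_row, orig_row, FUN = seq_along)

# Compare the two approaches element by element
print(all(month_manual == month_ave))
```

Because the work is done once per group rather than once per row, this should scale much better than an `sapply` over every row index.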

DATA.

df <-
structure(list(NAME = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("POP1", 
"POP2"), class = "factor"), ORIG_ROW = c(1L, 1L, 1L, 2L, 2L, 
2L)), row.names = c(NA, -6L), class = "data.frame")
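As a quick check, the sketch below rebuilds an equivalent sample data frame with `data.frame()` and applies the `ave` call to it. It also shows one possible alternative, `sequence(rle(...)$lengths)`, which is not from the original answer and relies on the assumption that all rows of a group are contiguous, as they are in the sample data.

```r
# Sample data equivalent to the structure() call above
# (NAME is stored as character here rather than factor)
df <- data.frame(
  NAME = c("POP1", "POP1", "POP1", "POP2", "POP2", "POP2"),
  ORIG_ROW = c(1L, 1L, 1L, 2L, 2L, 2L)
)

# Number the rows within each ORIG_ROW group, as in the answer
df$MONTH <- with(df, ave(ORIG_ROW, ORIG_ROW, FUN = seq_along))

# Alternative sketch: ASSUMES each group's rows sit next to each other.
# rle() gives the run lengths; sequence() expands them to 1..n per run.
df$MONTH_RLE <- sequence(rle(df$ORIG_ROW)$lengths)

print(df)
```

If the groups may be interleaved in the real data, the `ave` version is the safer choice, since it groups by value rather than by position.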

Rui Barradas

this worked, thanks so much! apologies for not providing the data.frame to start. :) – K. Jean Aug 06 '18 at 18:02
