I would like to add a column to a data frame whose values are based on the order in which rows appear for each level of a factor in another column. Specifically, I would like a "1" for the first visit to a point, a "2" for the second visit, a "3" for the third, and so on. However, some points have repeated visits on a given date, and those rows should share the same visit number.

The data frame is pre-sorted and looks something like this:

  Transect Point    Date
1      BEN     1  5/7/12
2      BEN     1 5/10/12
3      BEN     1 5/10/12
4      BEN     2  5/8/12
5      BEN     2 5/11/12
6      BEN     2 5/13/12

I would like to get something like this:

  Transect Point    Date Visit
1      BEN     1  5/7/12     1
2      BEN     1 5/10/12     2
3      BEN     1 5/10/12     2
4      BEN     2  5/8/12     1
5      BEN     2 5/11/12     2
6      BEN     2 5/13/12     3
  • Very closely associated with [**this one**](http://stackoverflow.com/questions/15280472/in-r-how-do-i-create-consecutive-id-numbers-for-each-repetition-in-a-separate-v/15281528#15281528). I'd say it's a duplicate. – Arun Mar 08 '13 at 18:43

1 Answer

Assuming your data.frame is called SODF, use ave:

within(SODF, {
  Visit <- ave(Point, Point, FUN = seq_along)
})
#   Transect Point    Date Visit
# 1      BEN     1  5/7/12     1
# 2      BEN     1 5/10/12     2
# 3      BEN     1 5/13/12     3
# 4      BEN     2  5/8/12     1
# 5      BEN     2 5/11/12     2
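
To try this end to end, here is a minimal self-contained sketch. The data frame is re-typed from the example in the question as it now stands (including the repeated 5/10/12 row), and the column types, in particular a numeric Point, are an assumption about how the real data is stored.

```r
# Example data re-typed from the question; column types are an assumption.
SODF <- data.frame(
  Transect = rep("BEN", 6),
  Point    = c(1, 1, 1, 2, 2, 2),
  Date     = c("5/7/12", "5/10/12", "5/10/12", "5/8/12", "5/11/12", "5/13/12"),
  stringsAsFactors = FALSE
)

# Number the rows within each Point in the order they appear.
SODF$Visit <- ave(SODF$Point, SODF$Point, FUN = seq_along)
SODF$Visit
# 1 2 3 1 2 3
```

Note that this numbers rows, not dates, so the two 5/10/12 rows for Point 1 get different values (2 and 3). That is the case the update further down deals with.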

If you are grouping by more than one column, for example "Transect" and "Point", change the ave statement to:

ave(Point, Transect, Point, FUN = seq_along)
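
A small sketch of why the extra grouping column matters. The data here is hypothetical (a second transect "ABC" that reuses point number 1) and is not from the question.

```r
# Hypothetical data: two transects that reuse the same point numbers.
df <- data.frame(
  Transect = c("BEN", "BEN", "BEN", "ABC", "ABC"),
  Point    = c(1, 1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Grouping by Point alone lumps BEN point 1 and ABC point 1 together.
by_point <- with(df, ave(Point, Point, FUN = seq_along))
by_point
# 1 2 1 3 4

# Grouping by Transect and Point restarts the count for each transect.
by_both <- with(df, ave(Point, Transect, Point, FUN = seq_along))
by_both
# 1 2 1 1 2
```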

There are, of course, other approaches, both using base R and using packages. Several of these are summarized and benchmarked by @Arun in his answer here.


Update to address new question requirements

One quick solution that comes to mind considering your new requirement is to first extract the unique cases, perform the index generation as done above, and merge the resulting table with your original table.

SODFunique <- SODF[!duplicated(SODF), ]
SODFunique <- within(SODFunique, {
  Visit <- ave(Point, Transect, Point, FUN = seq_along)
})
merge(SODF, SODFunique, sort = FALSE)
#   Transect Point    Date Visit
# 1      BEN     1  5/7/12     1
# 2      BEN     1 5/10/12     2
# 3      BEN     1 5/10/12     2
# 4      BEN     2  5/8/12     1
# 5      BEN     2 5/11/12     2
# 6      BEN     2 5/13/12     3
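
If you would rather avoid the merge step (the documentation for merge describes the row order with sort = FALSE as unspecified), one possible alternative is to number the distinct dates within each group directly. This is a different technique from the one above, offered as a sketch only: it passes row indices to ave and uses match against unique to give repeated dates the same number. It assumes the rows are already sorted by date within each point, as in the question, and the example data is re-typed from the question.

```r
# Example data re-typed from the question; column types are an assumption.
SODF <- data.frame(
  Transect = rep("BEN", 6),
  Point    = c(1, 1, 1, 2, 2, 2),
  Date     = c("5/7/12", "5/10/12", "5/10/12", "5/8/12", "5/11/12", "5/13/12"),
  stringsAsFactors = FALSE
)

# For each Transect/Point group, i holds that group's row indices.
# match(Date[i], unique(Date[i])) gives each row the position of its
# date among the group's distinct dates, in order of first appearance.
SODF$Visit <- with(SODF, ave(
  seq_along(Date), Transect, Point,
  FUN = function(i) match(Date[i], unique(Date[i]))
))
SODF$Visit
# 1 2 2 1 2 3
```

Because nothing is merged, the rows stay in their original order.
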
  • I was just noting this as a duplicate referring to that post under comments – Arun Mar 08 '13 at 18:44
  • That worked beautifully, however I realize now another issue I failed to mention in my first post. I do have several repeated visits for a given date. Any thoughts on editing the previously mentioned code to tackle that issue? – Flammulation Mar 08 '13 at 18:50
  • @user2149445, I don't know. How do you expect to tackle the issue? If there is possibly more than one visit per date, then your visits column in this example data would all be 1, not a sequence as you described. – A5C1D2H2I1M1N2O1R2T1 Mar 08 '13 at 18:53
  • @AnandaMahto You're quite right. The output would not be a true sequence but each date within each point would have a unique visit number. – Flammulation Mar 08 '13 at 19:21
  • @AnandaMahto Thanks for the welcome. I've edited the original question and the example data frame to better address the issue I'm actually having. The initial responses answer my question about halfway. I'm missing the half that deals with my repeated rows (as in the new example data frame). I've read through the linked page [here](http://stackoverflow.com/questions/15280472/in-r-how-do-i-create-consecutive-id-numbers-for-each-repetition-in-a-separate-v/15281528#15281528) but still come up short in completing my question. – Flammulation Mar 08 '13 at 19:49
  • @user2149445, see my update for one alternative. – A5C1D2H2I1M1N2O1R2T1 Mar 08 '13 at 20:05