I've got some data in the following shape:

UPDATE: My data has an extra variable I'd like to group by. I used ddply with the solution provided by Richie below, but it did not work.

Country,group, date
US,A,'2011-10-01'
US,B,'2011-10-01'
US,C,'2011-10-01'
MX,D,'2011-10-01'
UK,E,'2011-10-02'
UK,B,'2011-10-02'
UK,A,'2011-10-02'
UK,C,'2011-10-02'

The data frame is already ordered, so A comes first, B second, and so on. What I am trying to create is a rank variable by date, like this:

Country,group, date,rank
US,A,'2011-10-01',1
US,B,'2011-10-01',2
US,C,'2011-10-01',3
MX,D,'2011-10-01',1
UK,E,'2011-10-02',1
UK,B,'2011-10-02',2
UK,A,'2011-10-02',3
UK,C,'2011-10-02',4
    ....
zx8754
Altons
  • I am sorry, but I disagree. I asked this question in 2011 and got an answer in 2011; the one you're suggesting was answered this year, oddly enough by you, @procrastinatus-maximus. A bit convenient. – Altons Oct 14 '16 at 06:45
  • It is true that I added an answer this year with the intention to add to the already existing answers which are older than this question. To my surprise the OP changed the accepted answer to mine. It is therefore a valid duplicate imo. – Jaap Oct 14 '16 at 07:31

1 Answer

First, check that your date really is in a date format (not a factor) using `class(your_dataset$date)`. If not, use `ymd` from lubridate to convert it.
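
A minimal sketch of that check, assuming the dates were read in as factors. The sample `your_dataset` below is made up to mirror the question's data, and base R's `as.Date` stands in for lubridate's `ymd` so the snippet runs without extra packages:

```r
# Hypothetical sample mirroring the question's data; dates start out as factors.
your_dataset <- data.frame(
  Country = c("US", "US", "US", "MX", "UK", "UK", "UK", "UK"),
  group   = c("A", "B", "C", "D", "E", "B", "A", "C"),
  date    = c(rep("2011-10-01", 4), rep("2011-10-02", 4)),
  stringsAsFactors = TRUE
)

class(your_dataset$date)  # a factor here, because of stringsAsFactors = TRUE

# Convert via character so the factor's labels (not its integer codes) are parsed.
if (!inherits(your_dataset$date, "Date")) {
  your_dataset$date <- as.Date(as.character(your_dataset$date), format = "%Y-%m-%d")
}
```

If lubridate is installed, `ymd(as.character(your_dataset$date))` should be an equivalent conversion.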

Second, use rank to get the rank. (Easier than you think, right!)

your_dataset$rank <- rank(your_dataset$date)

There are a few different methods for breaking ties that you might want to explore.
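
To illustrate the tie-breaking options, here is a small sketch with made-up dates. The `ties.method` values shown are a few of those documented for base R's `rank`; which one suits your data is a judgment call:

```r
# Made-up dates: the first two are tied.
dates <- as.Date(c("2011-10-01", "2011-10-01", "2011-10-02"))

r_average <- rank(dates)                         # default: ties share the mean rank
r_min     <- rank(dates, ties.method = "min")    # ties all get the lowest rank
r_first   <- rank(dates, ties.method = "first")  # ties broken by order of appearance
```

With `"first"`, tied dates are numbered in the order they appear, which is closest to a running counter but still counts across the whole column rather than restarting for each date.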

Upon rereading your question, I see you don't want to rank the dates; you want a counter within each date. To do this, first check that your dataset is ordered by date.

o <- with(your_dataset, order(date))
your_dataset <- your_dataset[o, ]

Then call seq_len on each chunk of date.

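# Number of rows per date, in sorted date order (matches the ordering above)
counts <- as.numeric(table(your_dataset$date))
# Build 1..n for each date and concatenate the pieces into one column
your_dataset$rank <- unlist(lapply(counts, seq_len))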
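
For the two-variable case raised in the question's update (a counter within each Country and date), one possible alternative is base R's `ave`, sketched below. This is not the method used above; the sample data is made up to mirror the question, and `ave` is chosen because it numbers rows within each combination without requiring the data frame to be sorted first:

```r
# Hypothetical sample mirroring the question's updated data.
your_dataset <- data.frame(
  Country = c("US", "US", "US", "MX", "UK", "UK", "UK", "UK"),
  group   = c("A", "B", "C", "D", "E", "B", "A", "C"),
  date    = c(rep("2011-10-01", 4), rep("2011-10-02", 4)),
  stringsAsFactors = FALSE
)

# Number the rows within each Country/date combination, keeping row order.
your_dataset$rank <- ave(
  seq_len(nrow(your_dataset)),
  your_dataset$Country, your_dataset$date,
  FUN = seq_along
)
```

On this sample the counter restarts for each Country/date pair, giving 1, 2, 3 for US, 1 for MX, and 1 to 4 for UK, which matches the output requested in the question.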
Richie Cotton
  • Thanks for the hint - I deleted my answer. If further clarification is needed, Altons will surely comment. – Seb Dec 22 '11 at 14:10
  • This is working, however I formulated my question in the wrong way! See the update. – Altons Dec 22 '11 at 14:32
  • I need to create the rank by 2 variables instead of one as I stated in my question initially. Sorry for the pain – Altons Dec 22 '11 at 14:39
  • Easiest fix is to create a new factor: `within(your_dataset, group <- paste(Country, date))`. Then replace `date` with `group` in my solution above. – Richie Cotton Dec 22 '11 at 15:58