I've only recently started learning R. I have the following table:

Name      stDte      edDte  
A    2010-05-01 2014-12-01  
B    2013-06-01 2014-02-01  

I need to get it transformed into a table like this

Name   Dte
A      2010-05-01
A      2010-06-01
A      2010-07-01
...
A      2014-12-01
B      2013-06-01
B      2013-07-01
...
B      2014-02-01

I'm thinking about using a 'for' loop together with rbind, but I'm not sure how to go about it. I would appreciate any suggestions on how to do that. Thanks in advance for the guidance.

Keong
  • Welcome to SO. Have you tried anything? – John Powell Jan 24 '15 at 10:13
  • Yes, Richard Scriven's data.table approach does exactly what I need. I hadn't thought of using seq(stDte, edDte, by = 'month') together with by = 'Name' to add new rows to the table, as I had only used by to aggregate rows previously. This is a great forum with super fast responses. I'm going to learn a lot here. – Keong Jan 24 '15 at 11:43

3 Answers

10

Since you don't state otherwise, this answer assumes the stDte and edDte columns are both of "Date" class.
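
If you want to reproduce the example, a minimal sketch of the input might look like the following. The values are taken from the question; building the table with data.frame() and as.Date() is only an assumption about how the data was created.

```r
# Example input rebuilt from the question's table.
# Assumption: both date columns are converted to "Date" class up front.
df <- data.frame(
    Name  = c("A", "B"),
    stDte = as.Date(c("2010-05-01", "2013-06-01")),
    edDte = as.Date(c("2014-12-01", "2014-02-01"))
)
str(df)
```

If your columns were read in as character or factor instead, wrapping them in as.Date() first should put you in the situation this answer assumes.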

In base R you could use Map() to create one sequence of dates per row, then data.frame() to assemble the result, expanding the Name column with rep.int() so that each name is repeated once per date in its sequence.

M <- Map(seq, df$stDte, df$edDte, by = "month")
df2 <- data.frame(
    Name = rep.int(df$Name, vapply(M, length, 1L)), 
    Dte = do.call(c, M)
)    
str(df2)
# 'data.frame':    65 obs. of  2 variables:
#  $ Name: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
#  $ Dte : Date, format: "2010-05-01" "2010-06-01" ...
head(df2, 3)
#   Name        Dte
# 1    A 2010-05-01
# 2    A 2010-06-01
# 3    A 2010-07-01
tail(df2, 3)
#    Name        Dte
# 63    B 2013-12-01
# 64    B 2014-01-01
# 65    B 2014-02-01

Or you can use the data.table package and do

library(data.table)
setDT(df)[, .(Dte = seq(stDte, edDte, by = "month")), by = Name]
Rich Scriven
4

You can build a data frame for each row and then rbind them together. The argument recycling of the data.frame function will repeat the value of 'Name' as many times as necessary:

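# Build one data frame per input row, then stack them.
# seq_len() is used so that an empty input yields no iterations.
do.call(rbind,
        lapply(seq_len(nrow(dat)), function(x) {
            data.frame(Name = dat[x, "Name"],
                       Dte = seq(as.Date(dat[x, "stDte"]),
                                 as.Date(dat[x, "edDte"]), by = "month"))
        }))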
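
To make the recycling point concrete, and since the question asked about a 'for' loop with rbind, here is a rough sketch of the same idea written as an explicit loop. The object names dat, one, pieces and res are just illustrative, and dat is a hypothetical copy of the question's table with the dates stored as text.

```r
# Hypothetical copy of the question's table; dates are kept as text here,
# which is why as.Date() is applied below.
dat <- data.frame(
    Name  = c("A", "B"),
    stDte = c("2010-05-01", "2013-06-01"),
    edDte = c("2014-12-01", "2014-02-01")
)

# Recycling: the single Name value is repeated to match the three dates.
one <- data.frame(
    Name = "B",
    Dte  = seq(as.Date("2013-06-01"), as.Date("2013-08-01"), by = "month")
)
nrow(one)  # 3

# Same idea as the lapply() call, written as an explicit loop.
# Pieces are collected in a list and bound once at the end.
pieces <- vector("list", nrow(dat))
for (i in seq_len(nrow(dat))) {
    pieces[[i]] <- data.frame(
        Name = dat[i, "Name"],
        Dte  = seq(as.Date(dat[i, "stDte"]), as.Date(dat[i, "edDte"]), by = "month")
    )
}
res <- do.call(rbind, pieces)
```

Collecting the pieces in a list and calling rbind once at the end is usually preferable to calling rbind inside the loop, which copies the growing result on every iteration.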
IRTFM
3
library(plyr)
ddply(df, .(Name), summarise, Dte = seq(as.Date(stDte), as.Date(edDte), by = "month"))
   Name        Dte
1     A 2010-05-01
2     A 2010-06-01
3     A 2010-07-01
4     A 2010-08-01
5     A 2010-09-01
6     A 2010-10-01
7     A 2010-11-01
8     A 2010-12-01
9     A 2011-01-01
10    A 2011-02-01
11    A 2011-03-01
12    A 2011-04-01
13    A 2011-05-01
14    A 2011-06-01
15    A 2011-07-01
16    A 2011-08-01
17    A 2011-09-01
18    A 2011-10-01
19    A 2011-11-01
20    A 2011-12-01
21    A 2012-01-01
22    A 2012-02-01
23    A 2012-03-01
24    A 2012-04-01
25    A 2012-05-01
26    A 2012-06-01
27    A 2012-07-01
28    A 2012-08-01
29    A 2012-09-01
30    A 2012-10-01
31    A 2012-11-01
32    A 2012-12-01
33    A 2013-01-01
34    A 2013-02-01
35    A 2013-03-01
36    A 2013-04-01
37    A 2013-05-01
38    A 2013-06-01
39    A 2013-07-01
40    A 2013-08-01
41    A 2013-09-01
42    A 2013-10-01
43    A 2013-11-01
44    A 2013-12-01
45    A 2014-01-01
46    A 2014-02-01
47    A 2014-03-01
48    A 2014-04-01
49    A 2014-05-01
50    A 2014-06-01
51    A 2014-07-01
52    A 2014-08-01
53    A 2014-09-01
54    A 2014-10-01
55    A 2014-11-01
56    A 2014-12-01
57    B 2013-06-01
58    B 2013-07-01
59    B 2013-08-01
60    B 2013-09-01
61    B 2013-10-01
62    B 2013-11-01
63    B 2013-12-01
64    B 2014-01-01
65    B 2014-02-01
DatamineR