1

I need to create a new column in current data frame from my original whose format given below.

Reg         Canceled
20161107    20171102
20120607    20131216 
20110105    20130524

The new column will be titled Tenure, and it should hold the difference in months between the date in the Canceled column and the one in the Reg column.

Reg          Canceled      Tenure
20161107    20171102         12
20120607    20130607         12
20110105    20110505          4
Claus Wilke
  • 16,992
  • 7
  • 53
  • 104
Green
  • 695
  • 2
  • 9
  • 21
  • Are your "dates" stored as integers, characters or factors? And I assume you are just trying to calculate the number of months between Reg and Canceled where those values are in some sort of YYYYMMDD format? – MrFlick Dec 12 '17 at 21:35
  • @MrFlick Yes, the dates are stored as integers and I am trying to calculate the number of months between Reg and Canceled where those values are formatted as YYYYMMDD. Essentially, I am adding a new column named tenure. – Green Dec 12 '17 at 22:01

2 Answers2

3

Borrowing from Number of months between two dates

df <- read.table(text='
Reg          Canceled      Tenure
20161107    20171102         12
                 20120607    20130607         12
                 20110105    20110505          4', header = T)
df$Tenure 

elapsed_months <- function(end_date, start_date) {
  ed <- as.POSIXlt(end_date)
  sd <- as.POSIXlt(start_date)
  12 * (ed$year - sd$year) + (ed$mon - sd$mon)
} 

df$Tenure2 <- elapsed_months(strptime(df$Canceled,'%Y%m%d'),strptime(df$Reg,'%Y%m%d'))
mr.joshuagordon
  • 754
  • 4
  • 8
0

I suggest

Reg <- c(       
"20161107",    
"20120607",    
"20110105"  
)
Canceled <- c(
  "20171102",
  "20131216" ,
  "20130524"
)
df <- data.frame(Reg=Reg, Canceled=Canceled)
parsed_reg <- gsub("([0-9]{4})([0-9]{2})([0-9]{2})", "\\1-\\2-\\3", df$Reg)
split_reg <- strsplit(parsed_reg, "-")
dates_reg <- t(data.frame(split_reg))
dates_reg <- data.frame(apply(dates_reg, 2, as.integer))
parsed_can <- gsub("([0-9]{4})([0-9]{2})([0-9]{2})", "\\1-\\2-\\3", df$Canceled)
split_can <- strsplit(parsed_can, "-")
dates_can <- t(data.frame(split_can))
dates_can <- data.frame(apply(dates_can, 2, as.integer))
colnames(dates_reg) <- c("year", "month", "day")
colnames(dates_can) <- c("year", "month", "day")

df$tenure <- (dates_can$year-dates_reg$year)*12 + (dates_can$month-dates_reg$month)
ge.org
  • 69
  • 3