-4

I have data set with columns - Reporting date, Flag 1 and 0 and account number.

So for one account we have multiple reporting date starting from Jan 2001 until Dec 2008.

Flag has value 1 and 0.

My requirement is to identify the month where flag becomes and 1 and fetch last 12 months data.

For example if flag becomes 1 in Dec 2008, then we should get the data for that account starting from Jan 2008 till Dec 2008.

I tried reshape and managed to pull the default date but now I need to fetch only last 12 months data.

I'm trying to do this in R. Any suggestion please.

zx8754
  • 52,746
  • 12
  • 114
  • 209
VA25
  • 1
  • 3
  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Jul 27 '18 at 06:05

1 Answers1

0

You can apply the condition you mentioned to create a list of data.frame then handle them accordingly. See below:

library(lubridate)

# data simulation
set.seed(123)
dates <- seq(dmy("01-01-2001"), to = dmy("31-12-2008"), by = 1)
df <- data.frame(
  rep_date = sample(dates, 5000, replace = TRUE),
  flag = sample(c(FALSE, TRUE), 5000, replace = TRUE),
  acc_num = sample(100:999, 5000, replace = TRUE)
)

# simulated data
head(df)
#     rep_date  flag acc_num
# 1 2006-02-04  TRUE     553
# 2 2001-09-27 FALSE     896
# 3 2007-05-05 FALSE     946
# 4 2008-07-16 FALSE     823
# 5 2003-06-04  TRUE     694
# 6 2005-07-04 FALSE     395


# list of data.frame
df_list <- apply(df[df$flag, ], 1, function(x) {
  r_date <- as_date(x[1])
  r_acc_num <- as.integer(x[3])
  df[df$rep_date <= r_date & df$rep_date >= (r_date - 365) & df$acc_num == r_acc_num, ]
})        

# result for Case 1000
df_list[[1000]]
#       rep_date   flag   acc_num
# 2010 2005-01-24  TRUE     199
# 3229 2004-12-18 FALSE     199
# 4790 2004-08-06  TRUE     199
Artem
  • 3,304
  • 3
  • 18
  • 41