
R learner here. I'm trying to use R to compare dates and assign a value based on the range each date falls in.

For example, if Acq_Dt is before October 1, 1984, Cap_Threshold should be 1000; if it falls between October 1, 1984 and September 30, 1991, it should be 5000; and so on. However, the expression is not working: everything evaluates to 5000. Any help would be appreciated.

temp1$Acq_Dt <-as.Date(temp1$Acq_Dt,format="%m/%d/%Y") 
temp1$CapThreshold <- if (temp1$Acq_Dt < "1984-10-01") {         
   1000
  } else if (temp1$Acq_Dt >= "1984-10-01" & temp1$Acq_Dt <= "1991-09-30")  {
   5000
  } else if (temp1$Acq_Dt >= "1991-10-01" & temp1$Acq_Dt <= "1993-09-30") {
   15000
  } else if (temp1$Acq_Dt >= "1993-10-01" & temp1$Acq_Dt <= "1994-09-30")  {
    25000
  } else if (temp1$Acq_Dt >= "1994-10-01" & temp1$Acq_Dt <= "1995-09-30") {
    50000
  } else if (temp1$Acq_Dt >= "1995-10-01" & temp1$Acq_Dt <= "2013-09-30") {
    100000
  } else if (temp1$Acq_Dt >= "2013-10-01") {
    1000000
} else { 
  0
}
BB123
  • What's actually in your `temp1` data.frame? Do your dates start out as character values or factor values? When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input that can be used to test and verify possible solutions. – MrFlick Aug 01 '18 at 18:06
  • Acq Date comes in as a string - I convert it to a date. Temp1 is a .csv file that I read in. – BB123 Aug 01 '18 at 18:53
  • Are you sure it’s read in as a string? Many R functions that import data convert string to factors. You should dput() a sample of your data so we can reproduce. – MrFlick Aug 01 '18 at 18:58
  • Getting the following warning: the condition has length > 1 and only the first element will be used – BB123 Aug 14 '18 at 14:01

3 Answers

temp1 <-as.Date("11/10/1991",format="%m/%d/%Y")

Barrier1<-as.Date("1984-10-01",format="%Y-%m-%d")
Barrier2<-as.Date("1991-09-30",format="%Y-%m-%d")
Barrier3<-as.Date("1993-09-30",format="%Y-%m-%d")
Barrier4<-as.Date("1994-09-30",format="%Y-%m-%d")
Barrier5<-as.Date("1995-09-30",format="%Y-%m-%d")
Barrier6<-as.Date("2013-09-30",format="%Y-%m-%d")
Threshold <- if (temp1 < Barrier1) {         
   1000
  } else if (temp1 >= Barrier1 & temp1 <= Barrier2)  {  # >= so that 1984-10-01 itself maps to 5000
   5000
  } else if (temp1 > Barrier2 & temp1 <= Barrier3) {
   15000
  } else if (temp1 > Barrier3 & temp1 <= Barrier4)  {
    25000
  } else if (temp1 > Barrier4 & temp1 <= Barrier5) {
    50000
  } else if (temp1 > Barrier5 & temp1 <= Barrier6) {
    100000
  } else if (temp1> Barrier6) {
    1000000
} else { 
  0
}

This shows that the comparison logic itself works when `temp1` is a single date, so the problem has to lie in how it is applied to your data.
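One caveat worth spelling out: here `temp1` is a single date, whereas in the question `temp1$Acq_Dt` is a whole column. `if` expects a single TRUE/FALSE condition, which is consistent with the "condition has length > 1" message quoted in the comments on the question. Below is a minimal sketch of a vectorized version using nested base-R `ifelse`. The four sample dates and the helper variable `d` are made up for illustration; the cut-off dates and amounts are taken from the question.

```r
# Hypothetical sample data; the real temp1 is read from the asker's CSV.
temp1 <- data.frame(
  Acq_Dt = c("03/15/1980", "10/01/1984", "11/10/1991", "01/02/2014"),
  stringsAsFactors = FALSE
)
temp1$Acq_Dt <- as.Date(temp1$Acq_Dt, format = "%m/%d/%Y")

# ifelse() tests every row, unlike if (), which takes one condition.
# Each test is only reached when the earlier ones were FALSE,
# so only the upper cut-off of each range needs checking.
d <- temp1$Acq_Dt
temp1$CapThreshold <-
  ifelse(d < as.Date("1984-10-01"), 1000,
  ifelse(d < as.Date("1991-10-01"), 5000,
  ifelse(d < as.Date("1993-10-01"), 15000,
  ifelse(d < as.Date("1994-10-01"), 25000,
  ifelse(d < as.Date("1995-10-01"), 50000,
  ifelse(d < as.Date("2013-10-01"), 100000, 1000000))))))

print(temp1)
```

Rows whose date failed to parse (NA) come out as NA rather than 0, which may or may not be what you want.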

RBeginner
  • Are you sure that was the problem? It worked for me without the explicit as.Date. You can compare dates with strings as long as the string is in the form yyyy-mm-dd and comes second in the comparison. – MrFlick Aug 01 '18 at 20:03
  • I just tried it, so it's not the problem. So I guess the only thing this shows is that his code is fine after all (...). – RBeginner Aug 01 '18 at 20:29
  • Nothing works. Apparently R is retarded when it comes to dates. – BB123 Aug 14 '18 at 05:24

An alternative would be to use `cut` and coerce the resulting factor to numeric:

set.seed(11)
temp1 <- data.frame(Acq_Dt = sample(seq(as.Date('1984-09-01'), as.Date('2013-11-01'), by = 'day'), 100))

breaks <- as.Date(c("1500-10-01", "1984-10-01", "1991-10-01", "1993-10-01", 
                    "1994-10-01", "1995-10-01", "2013-10-01", "2020-10-01"))

thresholds <- c(1000, 5000, 15000, 25000, 50000, 100000, 1000000)

temp1$Capthreshold <- as.numeric(as.character(cut(temp1$Acq_Dt, 
                                                  breaks = breaks,
                                                  labels = thresholds,
                                                  include.lowest = TRUE)))

Result:

        Acq_Dt Capthreshold
1   1992-10-02        15000
2   1984-09-06         1000
3   1999-07-24       100000
4   1985-01-28         5000
5   1986-07-21         5000
6   2012-07-04       100000
7   1987-03-11         5000
8   1993-02-13        15000
9   2010-05-03       100000
10  1988-04-04         5000
11  1989-10-08         5000
12  1997-07-05       100000
13  2011-02-06       100000
14  2009-06-17       100000
15  2006-01-18       100000
16  2001-05-18       100000
17  1998-09-13       100000
18  1994-04-18        25000
19  1989-04-04         5000
20  1998-08-25       100000
...
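A related sketch, not part of the answer above: with `cut`, any date later than the last break ("2020-10-01" here) falls outside every bin and comes back as NA. Base R's `findInterval` can do the same lookup without needing artificial outer breaks. The names `lower_bounds`, `caps`, `dates` and `cap` are made up for this illustration, and the four test dates are arbitrary.

```r
# Start date of every range except the first (open-ended) one.
lower_bounds <- as.Date(c("1984-10-01", "1991-10-01", "1993-10-01",
                          "1994-10-01", "1995-10-01", "2013-10-01"))
caps <- c(1000, 5000, 15000, 25000, 50000, 100000, 1000000)

dates <- as.Date(c("1984-09-30", "1984-10-01", "1995-09-30", "2021-01-01"))

# findInterval() counts how many lower bounds are <= each date
# (0 for dates before the first bound), so add 1 to index into caps.
cap <- caps[findInterval(as.numeric(dates), as.numeric(lower_bounds)) + 1]

print(data.frame(dates, cap))
```

Because the first and last ranges are open-ended, there is no need to guess a "far past" or "far future" break.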
acylam
library(dplyr)

# RECEIPT_DATE is assumed to already be a Date column (see as.Date in the question)
SB$Cap_Threshold <- case_when(
  SB$RECEIPT_DATE < "1984-10-01" ~ 1000,
  SB$RECEIPT_DATE >= "1984-10-01" & SB$RECEIPT_DATE <= "1991-09-30" ~ 5000,
  SB$RECEIPT_DATE >= "1991-10-01" & SB$RECEIPT_DATE <= "1993-09-30" ~ 15000,
  SB$RECEIPT_DATE >= "1993-10-01" & SB$RECEIPT_DATE <= "1994-09-30" ~ 25000,
  SB$RECEIPT_DATE >= "1994-10-01" & SB$RECEIPT_DATE <= "1995-09-30" ~ 50000,
  SB$RECEIPT_DATE >= "1995-10-01" & SB$RECEIPT_DATE <= "2013-09-30" ~ 100000,
  SB$RECEIPT_DATE >= "2013-10-01" ~ 1000000,
  TRUE ~ 999999999999999999  # sentinel for rows matching no range, e.g. NA dates
)
BB123
  • You should include `library(something)` at the top so that others who come by this answer later know how to use it. (There is no case_when in base R.) – Frank Aug 16 '18 at 20:54