I want to create a new column (FallCount) in my dataframe (TowpathMerge) that shows whether a student was enrolled for 20 or more days between August 15, 2017 and November 30, 2017. If so, FallCount should read yes; if not, it should read no. EffectiveStartDateFS is the date a student enrolled, and DistrictWithdrawDateFS is the date a student withdrew.

For example, Joshua King would have yes in FallCount because he started school on August 14 and didn't withdraw until January 2018. White would also count because he started school on August 21 and withdrew more than 20 days later, on September 28. On the other hand, Clark would not count because she started school on August 21 but withdrew on August 29.

rollstuhlfahrer
Melissa
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Mar 12 '18 at 19:03
  • Sorry! First time asking a question! My picture would not post. – Melissa Mar 12 '18 at 19:08
  • Welcome to the site! But pictures of data are not particularly helpful; share your data in a reproducible format as described in the link I provided. It makes it much easier for others to help. Also give the desired output. – MrFlick Mar 12 '18 at 19:09
  • No problem. Have a look at link shared by @MrFlick. It explains what details to be added as part of your question. You'll learn in the process. – MKR Mar 12 '18 at 19:10
  • Add the result of `dput(yourDATA)` to your question – Andre Elrico Mar 12 '18 at 19:12
  • Also add the DESIRED RESULT. Then we can SEE what you want. – Andre Elrico Mar 12 '18 at 19:14

2 Answers

You need to express your dates as dates, either with base R or with a package such as lubridate. For example:

library(lubridate)
Lowest <- mdy("August 15, 2017")

or whatever format you use.
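
If you would rather stay in base R, a minimal sketch of the same step could look like this. The input formats below are assumptions for illustration, and `Highest` is a name chosen here for the end of the window:

```r
# Base R alternative to lubridate::mdy(); the input formats are assumed.
Lowest  <- as.Date("2017-08-15")                       # ISO format needs no format string
Highest <- as.Date("11/30/2017", format = "%m/%d/%Y")  # explicit format for month/day/year text

class(Lowest)      # confirms the value is stored as a Date
Highest - Lowest   # subtracting two Dates gives a difference in days
```

Once both ends of the window are Date objects, they can be compared and subtracted like ordinary numbers.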

Then you can work with them as you would with other quantities.

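Highest <- mdy("November 30, 2017")
start <- pmax(TowpathMerge$EffectiveStartDateFS, Lowest)    # clip to the window start
end <- pmin(TowpathMerge$DistrictWithdrawDateFS, Highest)   # clip to the window end
TowpathMerge$FallCount <- ifelse(as.numeric(end - start) >= 20, "yes", "no")

Each enrollment is clipped to the window before the days are counted, so a student who started before August 15 or withdrew well before November 30 is still counted correctly. Both columns must already be dates, and each is referenced with the TowpathMerge$ prefix.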
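
As a rough check against the three students described in the question, here is a small self-contained sketch. The sample rows are hypothetical (the question's data was only posted as an image), King's exact withdraw date is assumed, and it counts calendar days rather than school days:

```r
# Hypothetical rows based on the examples in the question.
# King's withdraw date is assumed; the question only says January 2018.
TowpathMerge <- data.frame(
  Student = c("King", "White", "Clark"),
  EffectiveStartDateFS   = as.Date(c("2017-08-14", "2017-08-21", "2017-08-21")),
  DistrictWithdrawDateFS = as.Date(c("2018-01-10", "2017-09-28", "2017-08-29"))
)

Lowest  <- as.Date("2017-08-15")
Highest <- as.Date("2017-11-30")

# Clip each enrollment to the window, then count the overlapping days.
start <- pmax(TowpathMerge$EffectiveStartDateFS, Lowest)
end   <- pmin(TowpathMerge$DistrictWithdrawDateFS, Highest)
days_in_window <- as.numeric(end - start)

# Enrollments entirely outside the window give a negative count, hence "no".
TowpathMerge$FallCount <- ifelse(days_in_window >= 20, "yes", "no")
TowpathMerge
```

With these rows, King and White come out as yes and Clark as no, which matches the outcome the question asks for.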

Juanjo

You just need to convert the dates from character vectors to R's Date objects. After that it is just a matter of subtracting them and evaluating the result.

Assuming df is the name of the data.frame, this statement stores TRUE if the difference is 20 days or more, else FALSE:

df$FallCount <- difftime(as.Date(df$DistrictWithdrawDateFS, "%Y-%m-%d"), as.Date(df$EffectiveStartDateFS, "%Y-%m-%d"), units = "days") >= 20

Note: This does not handle NA in either of the dates, and it measures the whole enrollment span rather than only the days that fall between August 15 and November 30. You may want to add your own logic to handle such situations.
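
One possible way to deal with missing dates is sketched below. The sample data is hypothetical, and treating a missing withdraw date as "still enrolled" is an assumption you should check against how your district records withdrawals:

```r
# Hypothetical data: the second student has no withdraw date, the third no start date.
df <- data.frame(
  EffectiveStartDateFS   = c("2017-08-21", "2017-08-21", NA),
  DistrictWithdrawDateFS = c("2017-09-28", NA, "2017-09-01"),
  stringsAsFactors = FALSE
)

start <- as.Date(df$EffectiveStartDateFS, "%Y-%m-%d")
end   <- as.Date(df$DistrictWithdrawDateFS, "%Y-%m-%d")

# Assumption: a missing withdraw date means "still enrolled", so cap it at the window end.
end[is.na(end)] <- as.Date("2017-11-30")

days <- as.numeric(difftime(end, start, units = "days"))

# Rows with a missing start date stay NA instead of being guessed.
df$FallCount <- days >= 20
```

Leaving the result as NA when the start date is missing keeps those rows visible for manual review instead of silently marking them FALSE.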

windrunn3r.1990