I'm dealing with a fairly large dataset and trying to run a mixed-effects multilevel regression using the mixor
function. I was advised to switch dates to a categorical variable, as the current format is taking quite some time to run. My data is stored within a tibble "arr_full", with dates as a column within this tibble stored as "rec_date" as a POSIXct data type.
I don't want to change the order of any of the rows and was thinking there might be some way to do this simply with lubridate
. Basically I want any date that falls within periods of 15 days of each month stored as a categorical variable. So any date that falls within 15-30 Apr 2020 would be "1", 1-14 May 2020 as "2", and so on till 1-14 Apr 2021 as the last category "24".
I'm fairly new to R so any explanation of how to go about this and why would be extremely helpful. Thanks in advance for the help!
**edit: I've added my code below with the way I tried to do this. It's admittedly a bit roundabout to achieve it, but it's the way I could think of to do this. I made a separate column from my ordered date column, rec_date, and made rec_date1 that I want to be replaced by the specified variables.
I don't get any errors when I run the code, but then when I run arr_full$rec_date1
, it still outputs the original dates, not the replaced variables.
arr_full$rec_date1 <- ifelse(lubridate::month(arr_full$rec_date1) ==4 & lubridate::day(arr_full$rec_date1) >=15 , 1,
ifelse(lubridate::month(arr_full$rec_date1) ==5 & lubridate::day(arr_full$rec_date1) <15 , 2,
ifelse(lubridate::month(arr_full$rec_date1) ==5 & lubridate::day(arr_full$rec_date1) >=15 , 3,
ifelse(lubridate::month(arr_full$rec_date1) ==6 & lubridate::day(arr_full$rec_date1) <15 , 4,
ifelse(lubridate::month(arr_full$rec_date1) ==6 & lubridate::day(arr_full$rec_date1) >=15 , 5,
ifelse(lubridate::month(arr_full$rec_date1) ==7 & lubridate::day(arr_full$rec_date1) <15 , 6,
ifelse(lubridate::month(arr_full$rec_date1) ==7 & lubridate::day(arr_full$rec_date1) >=15 , 7,
ifelse(lubridate::month(arr_full$rec_date1) ==8 & lubridate::day(arr_full$rec_date1) <15 , 8,
ifelse(lubridate::month(arr_full$rec_date1) ==8 & lubridate::day(arr_full$rec_date1) >=15 , 9,
ifelse(lubridate::month(arr_full$rec_date1) ==9 & lubridate::day(arr_full$rec_date1) <15 , 10,
ifelse(lubridate::month(arr_full$rec_date1) ==9 & lubridate::day(arr_full$rec_date1) >=15 , 11,
ifelse(lubridate::month(arr_full$rec_date1) ==10 & lubridate::day(arr_full$rec_date1) <15 , 12,
ifelse(lubridate::month(arr_full$rec_date1) ==10 & lubridate::day(arr_full$rec_date1) >=15 , 13,
ifelse(lubridate::month(arr_full$rec_date1) ==11 & lubridate::day(arr_full$rec_date1) <15 , 14,
ifelse(lubridate::month(arr_full$rec_date1) ==11 & lubridate::day(arr_full$rec_date1) >=15 , 15,
ifelse(lubridate::month(arr_full$rec_date1) ==12 & lubridate::day(arr_full$rec_date1) <15 , 16,
ifelse(lubridate::month(arr_full$rec_date1) ==12 & lubridate::day(arr_full$rec_date1) >=15 , 17,
ifelse(lubridate::month(arr_full$rec_date1) ==1 & lubridate::day(arr_full$rec_date1) <15 , 18,
ifelse(lubridate::month(arr_full$rec_date1) ==1 & lubridate::day(arr_full$rec_date1) >=15 , 19,
ifelse(lubridate::month(arr_full$rec_date1) ==2 & lubridate::day(arr_full$rec_date1) <15 , 20,
ifelse(lubridate::month(arr_full$rec_date1) ==2 & lubridate::day(arr_full$rec_date1) >=15 , 21,
ifelse(lubridate::month(arr_full$rec_date1) ==3 & lubridate::day(arr_full$rec_date1) <15 , 22,
ifelse(lubridate::month(arr_full$rec_date1) ==3 & lubridate::day(arr_full$rec_date1) >=15 , 23,
ifelse(month(arr_full$rec_date1) ==4 & day(arr_full$rec_date1) <15 , 24, NA)))))))))))))))))))))))
> arr_full$rec_date1
[1] "2020-06-12 GMT" "2020-07-25 GMT" "2020-09-04 GMT" "2020-12-07 GMT" "2020-06-12 GMT" "2020-07-25 GMT"
[7] "2020-09-07 GMT" "2020-12-07 GMT" "2021-03-11 GMT" "2020-06-12 GMT" "2020-07-25 GMT" "2020-09-08 GMT"
[13] "2020-12-07 GMT" "2021-03-08 GMT" "2020-06-20 GMT" "2020-07-26 GMT" "2020-06-20 GMT" "2020-09-11 GMT"...