1

The data frame entails a date column in the format YYYY.WW. It is imported from a csv. and classified as numeric which does not allow me to properly use it for time series analysis or ggplot visualisations. I tried to transform it as such but I get an error.

df_full <- transform(df_full, DELY_WK = as.Date(as.numeric(DELY_WK), "%Y.%W"))

Error in charToDate(x) : 
  character string is not in a standard unambiguous format

Looking at the values, they appear unambiguous to me and to the best of my knowledge, does %Y accept a four digit and %W accept a two digit integer ranging from 00 to 53 with Monday as first day of the week as input.

> class(df_full)
[1] "data.frame"
> unique(df_full$DELY_WK)
  [1] 2015.05 2015.06 2015.07 2015.08 2015.09 2015.10 2015.11 2015.12 2015.13 2015.14 2015.15 2015.16 2015.17 2015.19 2015.21
 [16] 2015.22 2015.23 2015.24 2015.25 2015.26 2015.27 2015.28 2015.29 2015.30 2015.31 2015.34 2015.36 2015.37 2015.38 2015.39
 [31] 2015.40 2015.41 2015.42 2015.43 2015.44 2015.45 2015.46 2015.47 2015.48 2015.49 2015.50 2015.51 2016.03 2016.04 2016.05
 [46] 2016.06 2016.07 2016.08 2016.09 2016.10 2016.11 2016.12 2016.13 2016.14 2016.15 2016.16 2016.17 2016.19 2016.20 2016.21
 [61] 2016.22 2016.23 2016.24 2016.25 2016.26 2016.27 2016.28 2016.29 2016.30 2016.31 2016.36 2016.37 2016.38 2016.39 2016.40
 [76] 2016.41 2016.42 2016.43 2016.44 2016.45 2016.46 2016.47 2016.48 2016.49 2016.50 2016.51 2017.01 2017.02 2017.03 2017.04
 [91] 2017.05 2017.06 2017.07 2017.08 2017.09 2017.10 2017.11 2017.12 2017.13 2017.14 2017.15 2017.16 2017.17 2017.18 2017.19
[106] 2017.20 2017.21 2017.22 2017.23 2017.24 2017.25 2017.26 2017.27 2017.28 2017.29 2017.30 2017.31 2017.36 2017.37 2017.38
[121] 2017.39 2017.40 2017.41 2017.42 2017.43 2017.44 2017.45 2017.46 2017.47 2017.48 2017.49 2017.50 2017.51 2018.01 2018.02
[136] 2018.03 2018.04 2018.05 2018.06 2018.07 2018.08 2018.11 2018.12 2018.13 2018.14 2018.15 2018.16 2018.17 2018.19 2018.20
[151] 2018.21 2018.22 2018.23 2018.24 2018.25 2018.27 2018.09 2018.26 2018.10 2018.29 2018.34 2018.35 2018.36 2018.37 2018.38
[166] 2018.39 2018.40 2018.41 2018.42 2018.43 2018.44 2018.45 2018.46 2018.47 2018.48 2018.49 2018.50 2018.51 2019.01 2019.02
[181] 2019.03 2019.04 2019.05 2019.06 2019.07 2019.08 2019.09 2019.10 2019.11 2019.12 2019.13 2019.14 2019.15 2019.16 2019.17
[196] 2019.18 2019.19 2019.20 2019.21 2019.22 2019.23 2019.24 2019.25 2019.26 2019.27 2019.28 2019.33 2019.34 2019.35 2019.36
[211] 2019.37 2019.38 2019.39 2019.40 2019.41 2019.42 2019.43 2019.44 2019.45 2019.46 2019.47 2019.48 2019.49 2019.50 2019.51
[226] 2020.04 2020.05 2020.06 2020.07 2020.08 2020.09 2020.10 2020.11 2020.12 2020.13 2020.14 2020.15 2020.16 2020.19 2020.20
[241] 2020.23 2020.25 2020.26 2020.27
> class(df_full$DELY_WK)
[1] "numeric"
Max
  • 185
  • 1
  • 11

2 Answers2

2

We need to convert to character instead of numeric and then it should work (assuming it is a numeric or factor class) after adding a day as well because the week number is not specific for a Date. It can be any day of the week. So, we paste a number first day and change the format for the week day number

df_full <- transform(df_full, 
     DELY_WK = as.Date(paste0(as.character(DELY_WK), '.01'), "%Y.%W.%w"))
akrun
  • 874,273
  • 37
  • 540
  • 662
  • It certainly transforms the value to a date, but removes nearly all values and remains with limited values in the column. ```> unique(df_full$DELY_WK) [1] "2015-08-13" "2016-08-13" "2017-08-13" "2018-08-13" "2019-08-13" "2020-08-13" > class(df_full$DELY_WK) [1] "Date"``` – Max Aug 13 '20 at 21:18
  • @Max Based on your post, the format is for "%W" i.e 2 digit week number from 1 to 52 – akrun Aug 13 '20 at 21:19
  • yes, that is correct. Although %W accepts the ISO format and hence 1-53 weeks. The output I showed in my last comment was the result after I changed to your line of code. – Max Aug 13 '20 at 21:26
  • @Max Can you update with the dput of your DELY_WK column. around 10-20 values is enough – akrun Aug 13 '20 at 21:28
  • @Max I guess I got the issue. You need a day as well because week can have any days and it is not specific – akrun Aug 13 '20 at 21:33
  • Here you go. > dput(df_full[1:20, "DELY_WK"]) c(2015.05, 2015.06, 2015.07, 2015.08, 2015.09, 2015.1, 2015.11, 2015.12, 2015.13, 2015.14, 2015.15, 2015.16, 2015.17, 2015.19, 2015.21, 2015.22, 2015.23, 2015.24, 2015.25, 2015.26) – Max Aug 13 '20 at 21:36
  • 1
    @Max I get `as.Date(paste0(as.character(v1), '.01'), '%Y.%W.%w')# [1] "2015-02-01" "2015-02-08" "2015-02-15" "2015-02-22" "2015-03-01" "2015-01-04" "2015-03-15" "2015-03-22" "2015-03-29" "2015-04-05" [11] "2015-04-12" "2015-04-19" "2015-04-26" "2015-05-10" "2015-05-24" "2015-05-31" "2015-06-07" "2015-06-14" "2015-06-21" "2015-06-28"` – akrun Aug 13 '20 at 21:37
  • 1
    Ok, great. I'll try that. Might have to move it in a new column as I make an inner join on the DELY_WK column later on. Plus I could format the dates on the x-axis in ggplot later on to %W-%Y which is beneficial. – Max Aug 13 '20 at 21:40
0

I ended up transforming the date field and adding a new 'ISOdate' column, so that it fits the ISOweek package. This was in particular helpful in cases when there is a 53rd week in a year. Plotting time series is working fine now with this new helper column.

ISOweek: Week of the year and weekday according to ISO 8601

Max
  • 185
  • 1
  • 11