
I have a time series of cinema data, and the "date" column is of interest. I would like to convert it to the format "YYYY/MM/DD". However, when I run my code:

CINEMA.TICKET$DATE <- as.Date(CINEMA.TICKET$date , format = "%y/%m/%d")

Two issues occur. First, the converted dates are shown on the far right of the table as, e.g., "0005-05-20". Second, many entries disappear entirely. Can someone explain what I am doing wrong and how I can do it properly?

film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity     date month quarter day    newdate       DATE
1      1492         304     3900000           26           0         4      4.26       150000         26 610.3286 5/5/2018     5       2   5 0005-05-20 2005-05-20
2      1492         352     3360000           42           0         5      8.08        80000         42 519.8020 5/5/2018     5       2   5 0005-05-20 2005-05-20
3      1492         489     2560000           32           0         4     20.00        80000         32 160.0000 5/5/2018     5       2   5 0005-05-20 2005-05-20
4      1492         429     1200000           12           0         1     11.01       100000         12 108.9918 5/5/2018     5       2   5 0005-05-20 2005-05-20
5      1492         524     1200000           15           0         3     16.67        80000         15  89.9820 5/5/2018     5       2   5 0005-05-20 2005-05-20
6      1492          71     1050000            7           0         3      0.98       150000          7 714.2857 5/5/2018     5       2   5 0005-05-20 2005-05-20
> str(CINEMA.TICKET)
r2evans
  • Can you edit your question and provide your data via `dput(CINEMA.TICKET)`? – AndrewGB Dec 28 '21 at 23:50
  • For 4-digit years you need to use "%Y". Also, the format needs to be in the correct order. So try format = "%m/%d/%Y" – Dave2e Dec 28 '21 at 23:53
  • @AndrewGillreath-Brown I am, frankly, unfamiliar with the dput function. I am new to R, apologies. Do you mind sharing what its purpose is? – SaltySenator Dec 29 '21 at 00:28
  • @SaltySenator Sure! "`dput` throws all information needed to exactly reproduce your data on your console. You may simply copy the output and paste it into your question." It helps to make your question reproducible, which is important on SO. You can read more about `dput` at: https://stackoverflow.com/a/5963610/15293191 – AndrewGB Dec 29 '21 at 01:02
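
To make the `dput` suggestion above concrete, here is a minimal sketch. The data frame `small` is made up and only stands in for CINEMA.TICKET; it is an assumption, not the asker's data.

```r
# Small made-up data frame standing in for CINEMA.TICKET (hypothetical values).
small <- data.frame(film_code = c(1492L, 1492L),
                    date = c("5/5/2018", "5/5/2018"),
                    stringsAsFactors = FALSE)

# Print R code that recreates the first rows; this output is what
# you would paste into the question.
dput(head(small))

# Anyone can rebuild the object by evaluating that printed code.
rebuilt <- eval(parse(text = capture.output(dput(small))))
all.equal(small, rebuilt)
```

For a large table, `dput(head(CINEMA.TICKET))` keeps the pasted output short.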

3 Answers


As @Dave2e pointed out, you are looking for:

CINEMA.TICKET[, date := as.Date(date , format = "%d/%m/%Y")]

This assumes the input is day-first, e.g. "30/5/2018". The question is not clear on this, since its only example, "5/5/2018", could be "%d/%m/%Y" or "%m/%d/%Y". Note that `:=` is data.table syntax, so CINEMA.TICKET must be a data.table.
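
A minimal base R sketch of why the original "%y/%m/%d" call misbehaves, and of one way to check which reading fits the data. The vector `x` is an assumption: its second value is made up to show a day greater than 12.

```r
# Strings in the same shape as the question's `date` column.
# The second value is hypothetical, chosen so the day exceeds 12.
x <- c("5/5/2018", "5/13/2018")

# Original attempt: %y reads "5" as a two-digit year (2005), %m reads "5",
# and %d reads only the first two digits of "2018", i.e. 20.
bad <- as.Date(x, format = "%y/%m/%d")
bad
# The second element is NA because 13 is not a valid month.

# Month-first and day-first readings with a four-digit year.
as_mdy <- as.Date(x, format = "%m/%d/%Y")
as_dmy <- as.Date(x, format = "%d/%m/%Y")

# Count parse failures under each reading.
c(mdy_failures = sum(is.na(as_mdy)), dmy_failures = sum(is.na(as_dmy)))
```

On real data, the reading that leaves no NA values is the plausible one. If both readings parse every row, the column alone cannot settle the question.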

As for ordering columns use:

setcolorder(CINEMA.TICKET, c("c", "b", "a"))

where "c", "b" and "a" are the column names in their desired order.

Sweepy Dodo

lubridate probably does the trick:

> lubridate::mdy("5/5/2018")
[1] "2018-05-05"

So you could use:

library(lubridate)
library(tidyverse)

CINEMA.TICKET <- CINEMA.TICKET %>% 
  mutate(DATE=mdy(date))
Martin

Here is another option:

library(tidyverse)

output <- df %>% 
  mutate(date = as.Date(date, format="%m/%d/%Y"))

Output

  film_code cinema_code total_sales tickets_sold tickets_out show_time occu_perc ticket_price ticket_use capacity       date month quarter day
1      1492         304     3900000           26           0         4      4.26       150000         26 610.3286 2018-05-05     5       2   5
2      1492         352     3360000           42           0         5      8.08        80000         42 519.8020 2018-05-05     5       2   5
3      1492         489     2560000           32           0         4     20.00        80000         32 160.0000 2018-05-05     5       2   5
4      1492         429     1200000           12           0         1     11.01       100000         12 108.9918 2018-05-05     5       2   5
5      1492         524     1200000           15           0         3     16.67        80000         15  89.9820 2018-05-05     5       2   5
6      1492          71     1050000            7           0         3      0.98       150000          7 714.2857 2018-05-05     5       2   5

A column of class Date always prints as "YYYY-MM-DD", so it cannot show forward slashes and remain a Date. You can change the format with format(), but the result is then classified as character again, not as Date.

class(output$date)
# [1] "Date"

output2 <- df %>% 
  mutate(date = as.Date(date, format="%m/%d/%Y")) %>% 
  mutate(date = format(date, "%Y/%m/%d"))

class(output2$date)
# [1] "character"
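
If you need both behaviours, one option is to keep the Date column for calculations and add a separate character column for display. A minimal sketch; the data frame `sales` and the column name `date_label` are made up for illustration.

```r
# Hypothetical two-row data frame; `date` is parsed into a Date column.
sales <- data.frame(
  date = as.Date(c("5/5/2018", "12/25/2018"), format = "%m/%d/%Y")
)

# Keep the Date column and add a character column in "YYYY/MM/DD" form.
sales$date_label <- format(sales$date, "%Y/%m/%d")

class(sales$date)
# [1] "Date"
class(sales$date_label)
# [1] "character"

# Date arithmetic still works on the original column.
diff(sales$date)
```

Sorting, filtering by range and differences should use `date`; `date_label` is only for printing or export.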

Data

df <-
  structure(
    list(
      film_code = c(1492L, 1492L, 1492L, 1492L, 1492L,
                    1492L),
      cinema_code = c(304L, 352L, 489L, 429L, 524L, 71L),
      total_sales = c(3900000L,
                      3360000L, 2560000L, 1200000L, 1200000L, 1050000L),
      tickets_sold = c(26L,
                       42L, 32L, 12L, 15L, 7L),
      tickets_out = c(0L, 0L, 0L, 0L, 0L,
                      0L),
      show_time = c(4L, 5L, 4L, 1L, 3L, 3L),
      occu_perc = c(4.26,
                    8.08, 20, 11.01, 16.67, 0.98),
      ticket_price = c(150000L, 80000L,
                       80000L, 100000L, 80000L, 150000L),
      ticket_use = c(26L, 42L, 32L,
                     12L, 15L, 7L),
      capacity = c(610.3286, 519.802, 160, 108.9918,
                   89.982, 714.2857),
      date = c("5/5/2018", "5/5/2018", "5/5/2018", "5/5/2018",
               "5/5/2018", "5/5/2018"),
      month = c(5L, 5L, 5L, 5L, 5L, 5L),
      quarter = c(2L,
                  2L, 2L, 2L, 2L, 2L),
      day = c(5L, 5L, 5L, 5L, 5L, 5L)
    ),
    class = "data.frame",
    row.names = c(NA,-6L)
  )
AndrewGB