0

I have one data set. This data set contain three columns. First have data in date format, second column have information about sales from different type of stores and third column sales contain data about sales. Below you can see data and screen shot of data.

            #Input data 
            library(dplyr)
            my_data_set<-structure(list(date_of_sale = structure(c(1556668800, 1556755200, 
                                                                       1556841600, 1556928000, 1557014400, 1557100800, 1557187200, 1557273600, 
                                                                       1557360000, 1556668800, 1556755200, 1556841600, 1556928000, 1557014400, 
                                                                       1557100800, 1557187200, 1557273600, 1557360000, 1556668800, 1556755200, 
                                                                       1556841600, 1556928000, 1557014400), class = c("POSIXct", "POSIXt"
                                                                       ), tzone = "UTC"), Stores = c(1, NA, 1, 1, 1, 1, 1, 1, 1, 2, 
                                                                                                     NA, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3), Sales = c(50, NA, 50, 
                                                                                                                                                        30, 50, 20, 30, 20, 20, 50, NA, 20, 50, 30, 40, 20, 20, 10, 80, 
                                                                                                                                                        20, 20, 20, 20)), row.names = c(NA, -23L), class = c("tbl_df", 
                                                                                                                                                                                                             "tbl", "data.frame")) 
 my_data_set<-data.frame(my_data_set)
 my_data_set$date_of_sale<-as.Date(my_data_set$date_of_sale)

 View(my_data_set) 

enter image description here

On this data set one of most important column here is first column. This column have irregular frequency of sales about each three stores. For example stores 1 and 2 have same sales in each day in period which start from 5.1.2019 to 5.9.2019. Store with number 3 have only sales in three days. In order to deal with this irregularity I want to create regular sequences of date and make left join function between this tables.You can see code below

    DATE_MATRIX<-data.frame(date_of_sale = seq.Date(as.Date("2019-05-01"), as.Date("2019-05-09"), by = "day"))
    Final_set<-left_join(DATE_MATRIX,my_data_set, by = c("date_of_sale"="date_of_sale"))              

I try with code above but this code can provide me shape of Final_set table which I want.My final goal should be table like example below.So can anybody help me how to figure out this problem with dplyr or some pivot table ?

enter image description here

silent_hunter
  • 2,224
  • 1
  • 12
  • 30
  • 1
    One option is `my_data_set %>% filter(!is.na(Stores)) %>% group_by(Stores) %>% mutate(rn = row_number()) %>% ungroup %>% pivot_wider(names_from = Stores, values_from = Sales)` – akrun Dec 29 '19 at 18:28
  • 1
    Or `reshape(transform(subset(my_data_set, !is.na(Stores)), date_of_sale = as.Date(date_of_sale)), idvar = "date_of_sale", timevar = "Stores", direction = "wide")` – markus Dec 29 '19 at 18:31
  • 1
    Why not simply `my_data_set %>% spread(Stores, Sales)` if the expected output still has `NA`'s? – Rui Barradas Dec 29 '19 at 18:34
  • I try all this three codes.I think second code with reshape give best results.Only think which is missing is arranging of dates. – silent_hunter Dec 29 '19 at 18:37

0 Answers0