1

Master

Store No. Online Pressure MON TUE WED ... Hours
   1        0.2     50     0   0   0  ...  53
   2        0.8     20     0   0   0  ...  30
   3        1.2     10     0   0   0  ...  20
   ...

Hours

Branch No. Day ... Time
    1     MON     7.50
    1     TUE     6.00
    1     WED     8.50
    3     MON     2.00
    3     TUE     1.00
    3     WED     2.50
    ...

The idea is I want to populate the days of the week "Mon", "Tue", "Wed"...etc in the 'Master' table from the data in the 'Hours' Table

How would I go about matching the Branch No. with the Store No. and then matching the 'day' data with the corresponding day in 'Master' columns. So the output of the Master should be as follows

Store No. Online Pressure MON TUE WED ... Hours
   1        0.2     50   7.50 6.00 8.50  ...  53
   2        0.8     20     0   0   0  ...  30
   3        1.2     10   2.00 1.00 2.50  ...  20
   ...

It should do it for every row until all matched.

I apologise if I have explained poorly but I am new to R so only just learning my way around. I appreciate any help or edits welcome if it helps explain the problem to others

  merge(
  Master[!names(Master) %in% Hours$Day],
  reshape(Hours,
      direction = "wide",
      idvar = "Branch No.",
      timevar = "Day"),

  by.x = "Store No.",
  by.y = "Branch No.",
  all = TRUE
 )

Warm regards

Daisy
  • 121
  • 1
  • 9
  • Hi Daisy. It makes it much easier for others to help you if you make your question fully reproducible - i.e. add code to create the data object to test the code on. This post gives useful guidance https://stackoverflow.com/a/5963610/1156245 – geotheory Sep 24 '20 at 12:44

2 Answers2

1

Here is a base R option using merge + reshape

merge(
  Master[!names(Master) %in% Hours$Day],
  reshape(Hours,
    direction = "wide",
    idvar = "Branch No.",
    timevar = "Day"
  ),
  by.x = "Store No.",
  by.y = "Branch No.",
  all = TRUE
)

which gives

  Store No. Online Pressure Hours Time.MON Time.TUE Time.WED
1         1    0.2       50    53      7.5        6      8.5
2         2    0.8       20    30       NA       NA       NA
3         3    1.2       10    20      2.0        1      2.5

where reshape transform df2 from long to wide data frame, and merge helps merge data.

Data

Master <- structure(list(`Store No.` = c(1, 2, 3), Online = c(0.2, 0.8, 
1.2), Pressure = c(50, 20, 10), MON = c(0, 0, 0), TUE = c(0, 
0, 0), WED = c(0, 0, 0), Hours = c(53, 30, 20)), row.names = c(NA, 
-3L), class = c("data.frame"))

Hours <- structure(list(`Branch No.` = c(1, 1, 1, 3, 3, 3), Day = c("MON", 
"TUE", "WED", "MON", "TUE", "WED"), Time = c(7.5, 6, 8.5, 2, 
1, 2.5)), row.names = c(NA, -6L), class = c(
"data.frame"))
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
  • Thank you for the above, I will give it a go now...but is this the most efficient way - as there are about 1,500 Stores...so there is so much data and would I need to enter them in manually? as df1 and df2? – Daisy Sep 24 '20 at 12:55
  • @Daisy I don't think you need to manually type those data. There might be some other more efficient way, but you can check my solution's performance on your real data – ThomasIsCoding Sep 24 '20 at 12:58
  • Hey! I am not getting any errors but df1 (Master) doesnt seem to be updating...i.e. MON -SUN are all still blank...Do you know why that might be? from the code above where does the 'Time' column come into play for df2 – Daisy Sep 24 '20 at 13:16
  • @Daisy `MON`,`TUE`... in `Day` column of `df2` is changed to column names `Time.MON`, `Time.TUE` ... when using `reshape` since it reorganizes the values in `Time` column with respect to the `Day` column. The reshaped `df2` is merged to `df1` after deleting the original related columns in `df1`, so all data should come from `df2` – ThomasIsCoding Sep 24 '20 at 13:28
  • I have added the written code above in the question, at the moment I am still getting nothing...no data is being merged to df1 as in Time.Mon,Time,Tue...etc..Am I going wrong somewhere in the code perhaps? – Daisy Sep 24 '20 at 13:37
  • @Daisy It works well with me. See my update and the data in my answer. You'd better put the data from `dput(Master)` and `dput(Hours)` to the post. – ThomasIsCoding Sep 24 '20 at 13:43
  • Perhaps I have misunderstood as I am still learning R...The Data section which you have posted is that something I need to code into the model also? The reason I ask is because you've written the stores individually which is something I cannot do due to the real data being large. Or is the Data, which is what I interpreted to be, the output in the console? – Daisy Sep 24 '20 at 13:50
  • @Daisy Yes, you can try `dput(head(Master))` and copy the output in console. Did you try the data in my answer? I think it is weird my code doesn't work on your dataset – ThomasIsCoding Sep 24 '20 at 14:09
  • I am so sorry, I am so confused here...So the 'Data' code (structure(list) which you have posted in your question is not needed? and I only need the 'Base R Option' code? So I have written those lines of code in my question and it runs by data shows in the Master...So you would like to see the console output? is this correct? even if the console tables are big? – Daisy Sep 24 '20 at 14:30
  • Thank you so much for your help and patience, really appreciate it – Daisy Sep 24 '20 at 14:31
1

Hope this helps:

> dput(Master)
structure(list(`Store No` = c(1, 2, 3), Online = c(0.2, 0.8, 
1.2), Pressure = c(50, 20, 10), MON = c(0, 0, 0), TUE = c(0, 
0, 0), WED = c(0, 0, 0), Hours = c(53, 30, 20)), row.names = c(NA, 
-3L), class = c("tbl_df", "tbl", "data.frame"))
> dput(Hours)
structure(list(`Branch No` = c(1, 1, 1, 3, 3, 3), Day = c("MON", 
"TUE", "WED", "MON", "TUE", "WED"), Time = c(7.5, 6, 8.5, 2, 
1, 2.5)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))
> Master
# A tibble: 3 x 7
  `Store No` Online Pressure   MON   TUE   WED Hours
       <dbl>  <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
1          1    0.2       50     0     0     0    53
2          2    0.8       20     0     0     0    30
3          3    1.2       10     0     0     0    20
> Hours
# A tibble: 6 x 3
  `Branch No` Day    Time
        <dbl> <chr> <dbl>
1           1 MON     7.5
2           1 TUE     6  
3           1 WED     8.5
4           3 MON     2  
5           3 TUE     1  
6           3 WED     2.5
> Master %>% pivot_longer(cols = c(MON, TUE, WED)) %>% inner_join(Hours, by = c('name' = 'Day', 'Store No' = 'Branch No')) %>% 
+   select(`Store No`, Online, Pressure, name, Hours, Time) %>% pivot_wider(names_from = name, values_from = Time)
# A tibble: 2 x 7
  `Store No` Online Pressure Hours   MON   TUE   WED
       <dbl>  <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
1          1    0.2       50    53   7.5     6   8.5
2          3    1.2       10    20   2       1   2.5
> 
Karthik S
  • 11,348
  • 2
  • 11
  • 25