R Merge using Reduce Function duplicate columns

Question

I'm using the below code to merge different data frames

Mth_TaT_Data <- Reduce(function(x,y) merge(x,y,by= "Order.Comment.1",all=TRUE), 
                       list(n_Cases,n_Auth_Cases,TaTless3days,TaT4to7days,TaT8to10days,
                            TaT11to20days,TaTgreater20days,mth_Mean,rpted_50_percent,
                            rpted_80_percent, rpted_90_percent))

This is how I am generating the data frames:

 rpted_90_percent <- TaT_Data %>%
  group_by(Order.Comment.1,Year) %>%
  filter(!is.na(Turnaround)) %>%
  summarise( rpted.90.percent = quantile(Turnaround, c(.9)) , .groups = 'drop')

But when I use reduce to merge the fields it is merging the date identifier each time. Can I stop this or should I look at using a different method to merge (inner_join?)

structure(list(Order.Comment.1 = structure(1:6, .Label = c("", 
"AB", "TD", "QE", "EF", "ASD", "HG", "AF", "DBF", "ASE", "AFF", 
"TRJ", "FJ", "ASFB", "UK", "ALDS", "XKC", "KCS", "LSN", "AVN", 
"LAB", "JVA", "ACN", "AKCN", "GMN", "FNDS", "SFNB", "SV"), class = "factor"), 
Year.x = c("2020-07", "2020-07", "2020-07", "2020-07", "2020-07", 
"2020-07"), n.Cases = c(3L, 221L, 105L, 291L, 153L, 1L), 
Year.y = c("2020-07", "2020-07", "2020-07", "2020-07", "2020-07", 
NA), n.Auth.Cases = c(3L, 221L, 105L, 291L, 148L, NA), Year.x = c("2020-07", 
"2020-07", "2020-07", "2020-07", "2020-07", NA), less_3_days = c(1L, 
80L, 7L, 125L, 46L, NA), Year.y = c("2020-07", "2020-07", 
"2020-07", "2020-07", "2020-07", NA), bet_4to7_days = c(1L, 
90L, 38L, 125L, 54L, NA), Year.x = c(NA, "2020-07", "2020-07", 
"2020-07", "2020-07", NA), bet_8to10_days = c(NA, 34L, 35L, 
26L, 14L, NA), Year.y = c("2020-07", "2020-07", "2020-07", 
"2020-07", "2020-07", NA), bet_11to20_days = c(1L, 17L, 20L, 
14L, 18L, NA), Year.x = c(NA, NA, "2020-07", "2020-07", "2020-07", 
"2020-07"), greater_20_days = c(NA, NA, 5L, 1L, 21L, 1L), 
Year.y = c("2020-07", "2020-07", "2020-07", "2020-07", "2020-07", 
"2020-07"), `mean(Turnaround)` = c(8, 5.34841628959276, 8.99047619047619, 
55.8969072164948, 9.08496732026144, 44), Year.x = c("2020-07", 
"2020-07", "2020-07", "2020-07", "2020-07", "2020-07"), rpted.50.percent = c(`50%` = 7, 
`50%` = 5, `50%` = 8, `50%` = 4, `50%` = 5, `50%` = 44), 
Year.y = c("2020-07", "2020-07", "2020-07", "2020-07", "2020-07", 
"2020-07"), rpted.80.percent = c(`80%` = 11.2, `80%` = 8, 
`80%` = 11, `80%` = 7, `80%` = 13, `80%` = 44), Year = c("2020-07", 
"2020-07", "2020-07", "2020-07", "2020-07", "2020-07"), rpted.90.percent = c(`90%` = 12.6, 
`90%` = 9, `90%` = 15, `90%` = 8, `90%` = 28.8, `90%` = 44
)), row.names = c(NA, 6L), class = "data.frame")

What I want is something like:

structure(list(Order.Comment.1 = structure(1:6, .Label = c("", 
"AB", "TD", "QE", "EF", "ASD", "HG", "AF", "DBF", "ASE", "AFF", 
"TRJ", "FJ", "ASFB", "UK", "ALDS", "XKC", "KCS", "LSN", "AVN", 
"LAB", "JVA", "ACN", "AKCN", "GMN", "FNDS", "SFNB", "SV"), class = "factor"), 
Year.x = c("2020-07", "2020-07", "2020-07", "2020-07", "2020-07", 
"2020-07"), n.Cases = c(3L, 221L, 105L, 291L, 153L, 1L), 
n.Auth.Cases = c(3L, 221L, 105L, 291L, 148L, NA),  less_3_days = c(1L, 
80L, 7L, 125L, 46L, NA),  bet_4to7_days = c(1L, 
90L, 38L, 125L, 54L, NA),  bet_8to10_days = c(NA, 34L, 35L, 
26L, 14L, NA),  bet_11to20_days = c(1L, 17L, 20L, 
14L, 18L, NA),  greater_20_days = c(NA, NA, 5L, 1L, 21L, 1L), 
`mean(Turnaround)` = c(8, 5.34841628959276, 8.99047619047619, 
55.8969072164948, 9.08496732026144, 44),  rpted.50.percent = c(`50%` = 7, 
`50%` = 5, `50%` = 8, `50%` = 4, `50%` = 5, `50%` = 44), 
rpted.80.percent = c(`80%` = 11.2, `80%` = 8, 
`80%` = 11, `80%` = 7, `80%` = 13, `80%` = 44), rpted.90.percent = c(`90%` = 12.6, 
`90%` = 9, `90%` = 15, `90%` = 8, `90%` = 28.8, `90%` = 44
)), row.names = c(NA, 6L), class = "data.frame")

Why not `merge` on `Option.Comment.1` AND `Year` (i.e., two items in `by` argument)? — Parfait, Sep 21 '20 at 21:53
@Parfait I get the below error message when i do that: **Error in "Order.Comment.1" & "Year" : operations are possible only for numeric, logical or complex types** CODE-> `'Mth_TaT_Data <- Reduce(function(x,y) merge(x,y,by= "Order.Comment.1" & "Year",all=TRUE), list(n_Cases,n_Auth_Cases,TaTless3days,TaT4to7days,TaT8to10days, TaT11to20days,TaTgreater20days,mth_Mean,rpted_50_percent, rpted_80_percent, rpted_90_percent))' `` — James Slasor, Sep 22 '20 at 05:33
Pass column names as a two-item character vector: `merge(x, y, by=c("Order.Comment.1", "Year"), all=TRUE)`. Or leave out `by` arg since it [defaults to matching names](https://stackoverflow.com/questions/6709151/how-do-i-combine-two-data-frames-based-on-two-columns/6709335#6709335). — Parfait, Sep 22 '20 at 14:29
@Parfait That works perfectly. Thank you. I tried that myself and couldn't get it to work. Obliviously i wasn't doing it right — James Slasor, Sep 22 '20 at 17:47

R Merge using Reduce Function duplicate columns

0 Answers0