1

I am currently working with four multiple data frames listed below.

  • tmp: Master table (variables included: group, weight, country, revenue)
  • price1: Price table 1 for group 1 - 3(variables included: group, weight, price)
  • price2: Price table 2 for group 4 - 8(variables included: group, price)
  • price3: Price table 3 for group 9(variables included: group, country, weight, price)

"weight" is an integer variable starting from 1 to 200. In price table 1 and 3, there is a price value assigned to each "weight".

I want to keep the master table as is and conditionally merge the price table 1,2,3 to it. The "group" variable in master table range from 1 to 9. The price tables were managed based on the "group" value as well. Each value represents one single product. Product in the same price table has similar nature with different price. However, products across price tables have different nature. Because of this difference, the merge criteria is different as well.

The purpose of this merge is to compare revenue to price for different products.

The following is the merge criteria between master table and 3 price table.

  • Primary key in Price table 1 to master table: group & weight
  • Primary key in Price table 2 to master table: group
  • Primary key in Price table 3 to master table: group & country & weight

Used code:

test <- ifelse(tmp$group %in% c(1,2,3), merge(tmp,price1,by=c("group","weight"))
            ,ifelse(tmp$group %in% c(4,5,6,7,8),merge(tmp, price2, by= "group")
                   ,merge(tmp,price3, by=c("group","country","weight")))))

The problem with my code is that the return object is not a data frame, it showed as a large list in R.

Jaap
  • 81,064
  • 34
  • 182
  • 193
Megan.Zzz
  • 29
  • 5
  • Please include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). – Jaap Jun 15 '16 at 19:38
  • Furthermore: [this Q&A](http://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) might be helpfull. – Jaap Jun 15 '16 at 19:41

2 Answers2

0

If I didn't understand wrongly, you are looking up price of the product in the master across the price data.frames. As the groups are unique in your example, it would be preferable first merge the price data.frames first, before merging with your master data.

data:

price1 <- data.frame(group=1:3, weight=sample(1:200, 3, replace=T), price=abs(rnorm(3, 15, 1)))
price2 <- data.frame(group=4:8, price=abs(rnorm(5, 15, 1)))
price3 <- data.frame(group=9, country="A", weight=sample(1:200, 1), price=abs(rnorm(1, 15, 1)))
tmp <- data.frame(group=1:10, weight=sample(1:200, 10, replace=T), country=sample(LETTERS, 10, replace=T), revenue=abs(rnorm(10, 150, 2)))

Merge the three price data.frame with plyr::ldply

library(plyr)
price <- plyr::ldply(list(price1, price2, price3))

Retain only two columns for the merged price data.frame:

library(dplyr)
price <- price %>% select(price, group)

Join the master data.frame with the price by group

output <- tmp %>% left_join(price, by="group")
   group weight country  revenue    price
1      1    196       N 149.4803 15.52752
2      2     55       L 150.3930 15.98541
3      3     78       E 150.6139 14.95468
4      4     62       D 151.4679 16.51612
5      5    107       Q 149.3292 15.59761
6      6    168       C 148.8256 14.49331
7      7    193       L 149.0341 16.31310
8      8     18       X 152.9192 14.06301
9      9    113       O 147.0069 14.08821
10    10    197       H 149.3581       NA
Adam Quek
  • 6,973
  • 1
  • 17
  • 23
0

Consider a row bind on subsetted tmp dataframes, each of which are merged with price dataframes:

test <- rbind(merge(tmp[tmp$group %in% c(1,2,3),], price1, by=c("group","weight")),
              merge(tmp[tmp$group %in% c(4,5,6,7,8),], price2, by="group"),
              merge(tmp[tmp$group == 9,], price3, by=c("group","country","weight")))
Parfait
  • 104,375
  • 17
  • 94
  • 125