good day i hope I will be able to get help with this specific topic, I want to merge my one file to another file only if it has that specific outlet and within that specific month but the rows should not be duplicated.
so my file 1 has the Outlet, Month, Sales while file 2 I want to merge to has the Outlet, Month, Price and Salesvolume, however File 2 can have many instances of a single store for a single month hence the end result might have way too many duplicated entries for a single month.
so If i have Outlet 2453 for Feb 2016 in File 1 then only one entry should represent that particular month with the extra columns received from file2.
My current code:
BPfac <- merge(File1, File2,by=c("Outlet","Month"))
BPfac1 = BPfac[which(!duplicated(File1$Outlet) & !duplicated(File2$Month)),]
This doesn't generate the result that i want is there a better way of doing this, I only get two rows from a possible merge of 99K Outlets. The month is coded so Month 256 might represent Feb 2017 that's how it works. I just want the end result to take the first record only when matching between the two files.
Thanks in advance!
File 1
Month Sales Outlet
256 24 2453
256 500 2453
243 48 2341
242 47 2356
243 89 2356
File 2
Month Price Salesvolume Outlet
256 1840 222700 2453
256 1840 237600 2453
243 1840 277100 2341
242 1840 279200 2356
243 1840 451400 2356
Expected Outcome
Month Price Sales Salesvolume Outlet
256 1840 24 222700 2453
243 1840 48 277100 2341
242 1840 47 279200 2356
See above the outcome only took the first outlet and month that matched between the two files and discarded the second outlet with the same Outlet and period, hope it makes sense now.