I need to merge several pairs of data frames.

On the one hand, I have several data frames with metadata (A) and, on the other hand, data frames with the corresponding information (B).

A.
[1] "LOJun_Meta" "LOMay_Meta" "VOJul_Meta" "VOJun_Meta" "VOMay_Meta" "ZOJun_Meta"
[7] "ZOMay_Meta"

B.
[1] "LOJun_All." "LOMay_all." "VOJul_All." "VOJun_all." "VOMay_all." "ZOJun_all."
[7] "ZOMay_all."

The names of the data frames are already in a list format (i.e. list1 and list2) and the data frames are already imported in R.

My aim is to create a loop that merges the respective data frames using dplyr's left_join. For example:

LOJun_Meta + LOJun_All.; LOMay_Meta + LOMay_all.; etc.

What I am struggling with is creating the loop that would "synchronize" the merging procedure.

I am unsure whether I should create a function that takes two inputs and does this "merging".

It would be something like

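merging <- function(list1, list2) {
  # list1 and list2 hold the data frame names in matching order
  for (i in seq_along(list1)) {
    # list1[[i]] and list2[[i]] are names (strings), not the data frames themselves
    left_join(list1[[i]], list2[[i]], by = c("PrimaryKey" = "ForeignKey"))
  }
}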

I reckon the problem is that list1 and list2 store only the names of the data frames, so the function needs to refer to the data frames those names point to rather than to the values in list1 and list2 themselves.

Any ideas?

Thanks a lot! Cheers

A diagram of what I intend to achieve is presented below:

[Diagram of loop - dplyr / several dataframes]

An example of what I am keen to automate would be this action:

ZOMay <- left_join(ZOMay_Meta, ZOMay_all., by = c("Primary Key" = "Foreign Key"))
ZOJun <- left_join(ZOJun_Meta, ZOJun_all., by = c("Primary Key" = "Foreign Key"))
write.csv(ZOMay, file = "ZOMay_Consolidated.csv")
write.csv(ZOJun, file = "ZOJun_Consolidated.csv")

Gustavo TA
  • I strongly recommend you read how to provide a [minimal, reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). You'll find that you get a lot more help from a question with that kind of structure. – alexwhan Sep 11 '18 at 12:24
  • Thanks! I am keen to have a more precise/objective question. – Gustavo TA Sep 11 '18 at 12:32
  • You can create a list of the data frames, or two lists (the base and the join), keep them in the proper order for joining, and do a for loop like you have: `a <- list of data.frames; b <- list of data.frames; for (i in seq_along(a)) { result <- a[[i]] %>% left_join(b[[i]], by = whatever) }` – MCP_infiltrator Sep 11 '18 at 12:35

1 Answer

Here's an example of how you could build a reproducible example for your situation:

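library(tidyverse)

# Two "a" data frames and two "b" data frames that share the key column id
# (tibble() replaces the deprecated data_frame())
df1a <- tibble(id = 1:3, var1 = LETTERS[1:3])
df2a <- tibble(id = 1:3, var1 = LETTERS[4:6])
df1b <- tibble(id = 1:3, var2 = LETTERS[7:9])
df2b <- tibble(id = 1:3, var2 = LETTERS[10:12])

# Pair them up by position: list1[[i]] will be joined with list2[[i]]
list1 <- list(df1a, df2a)
list2 <- list(df1b, df2b)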

Now, as I understand it, you want to do a left_join for df1a and df1b, as well as for df2a and df2b. Instead of a loop, you can use map2 from the purrr package. This will iterate over the two lists in parallel and apply a function to each pair of elements. With no by argument, left_join joins on the columns the two data frames share (here, id).

map2(list1, list2, left_join)
# [[1]]
# # A tibble: 3 x 3
#        id var1  var2 
#     <int> <chr> <chr>
#   1     1 A     G    
#   2     2 B     H    
#   3     3 C     I    
# 
# [[2]]
# # A tibble: 3 x 3
#        id var1  var2 
#     <int> <chr> <chr>
#   1     1 D     J    
#   2     2 E     K    
#   3     3 F     L 
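
If list1 and list2 hold only the names of the data frames rather than the data frames themselves, one possible adaptation is to look the objects up by name first and then join them pairwise. The following is a minimal sketch under assumptions, not a tested solution for the original data: the tibbles are made-up stand-ins, the key columns are assumed to be called PrimaryKey and ForeignKey as in the question's pseudocode, the names are assumed to be in matching order, and output goes to a temporary folder for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the real data frames (names follow the question's pattern)
LOJun_Meta <- tibble(PrimaryKey = 1:3, site = c("a", "b", "c"))
LOMay_Meta <- tibble(PrimaryKey = 1:3, site = c("d", "e", "f"))
LOJun_All. <- tibble(ForeignKey = 1:3, value = c(10, 20, 30))
LOMay_all. <- tibble(ForeignKey = 1:3, value = c(40, 50, 60))

# Character vectors of names, in matching order
# (if these are lists of strings, unlist() them first)
list1 <- c("LOJun_Meta", "LOMay_Meta")
list2 <- c("LOJun_All.", "LOMay_all.")

# mget() looks up objects by name and returns a named list of data frames
meta_dfs <- mget(list1, envir = environment())
info_dfs <- mget(list2, envir = environment())

# Join each pair; extra arguments to map2() are passed on to left_join()
joined <- map2(meta_dfs, info_dfs, left_join,
               by = c("PrimaryKey" = "ForeignKey"))

# Name each result after its prefix, e.g. "LOJun"
names(joined) <- sub("_Meta$", "", list1)

# Write one CSV per joined data frame
out_dir <- tempdir()  # assumption: replace with the real output folder
iwalk(joined, ~ write.csv(.x, file = file.path(out_dir, paste0(.y, "_Consolidated.csv"))))
```

Because map2 pairs elements by position, the two name vectors must be sorted so that each metadata name lines up with its counterpart; it may be worth checking that before joining.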
alexwhan
  • Thanks, Alex. Maybe something that is not entirely clear from my earlier post is that the two lists are only lists of the names of each data frame. When I try to pass map2(list1, list2, left_join(by = c("Primary Key", "Foreign Key"))), R outputs the following error: Error in UseMethod("left_join") : no applicable method for 'left_join' applied to an object of class "character". Thanks again. – Gustavo TA Sep 11 '18 at 13:38
  • This is why you need to make a minimal, _reproducible_ example. You need to have code in your question that recreates the relevant objects to work with; otherwise it's just guesswork. – alexwhan Sep 11 '18 at 13:53