I have two lists of dataframes, artrk_dist and comprk_dist. I am inner joining the corresponding datasets in the two lists using dplyr. I need to refer to the column names inside the join call using the index of the column rather than the column name itself. I tried using colnames(dataset)[index], but this doesn't work. I have this:

 allrk_dist<- inner_join(artrk_dist[[1]],comprk_dist[[1]], 
                                 by=c(colnames(artrk_dist[[1]])[1]= 
                                        colnames(comprk_dist[[1]])[1],
                                      colnames(artrk_dist[[1]])[2]= 
                                        colnames(comprk_dist[[1]])[2]))
    
    

But this gives an error:

Error: unexpected '=' in:
" allrk_dist[[1]]<- inner_join(artrk_dist[[1]],comprk_dist[[1]], 
                                 by=c(colnames(artrk_dist[[1]])[1]="
>                                         colnames(comprk_dist[[1]])[1],
Error: unexpected ',' in "                                        colnames(comprk_dist[[1]])[1],"
>                                       colnames(artrk_dist[[1]])[2]= 
+                                         colnames(comprk_dist[[1]])[2]))
Error: unexpected ')' in:
"                                      colnames(artrk_dist[[1]])[2]= 
                                        colnames(comprk_dist[[1]])[2])"
> 

Here is an example of what I'm doing which gives an error as above:

data1<- data.frame(a1=1:5,b1=6:10)
 data2<- data.frame(a2=2:6,b2=5:9)
 data_list<-list(data1,data2)
 y<- inner_join(data_list[[1]],data_list[[2]],by=colnames(data_list[[1]])[1]
                = colnames(data_list[[2]])[1])

I need to refer to the column names using an index and I am stuck there.

user2450223
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Apr 25 '21 at 17:43
  • Added an example now, don't know how useful it is – user2450223 Apr 25 '21 at 18:01
  • As pointed out in the other question, use `inner_join(data_list[[1]],data_list[[2]],by=setNames(colnames(data_list[[2]])[1], colnames(data_list[[1]])[1]))` – MrFlick Apr 25 '21 at 18:03
  • How do I add another index? I need to match on two indices, [1] and [2]. Is it just a comma-separated list inside the `setNames` call? – user2450223 Apr 25 '21 at 18:16
  • Can you make it so your example better represents what you are trying to do? Make it clear what the desired output is. – MrFlick Apr 25 '21 at 18:27

1 Answer

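The names inside `c()` must be literal names, so an expression such as `colnames(x)[1]` cannot appear on the left of `=`. Because we are passing computed column names, build the named vector with `setNames` instead: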

inner_join(data_list[[1]],data_list[[2]],by=setNames(
            colnames(data_list[[2]])[1],colnames(data_list[[1]])[1]))
akrun
  • I actually also have two keys to join on - how would the syntax change in this case? I.e., I have two indexes [1] and [2] I'm joining on. – user2450223 Apr 25 '21 at 18:12
  • @user2450223 this is based on your example, which is working fine for me – akrun Apr 25 '21 at 18:13
  • Sorry, the problem is I have TWO keys or indices to join on - I get a different error now: Error: `suffix` must be a character vector of length 2. suffix is a character vector of length 1. – user2450223 Apr 25 '21 at 18:21
  • @user2450223 if there are two columns, `setNames(colnames(data_list[[2]])[1:2], colnames(data_list[[1]])[1:2])` – akrun Apr 25 '21 at 18:24
  • @user2450223 your comment is not complete – akrun Apr 25 '21 at 18:24
  • Thank you, what is the general case like with non-contiguous indices? This really helped - seemed like such a small thing but was stuck for hours. – user2450223 Apr 25 '21 at 18:27
  • @user2450223 use `c`, i.e. `[c(1, 3, 5)]`, as indexes – akrun Apr 25 '21 at 18:28
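
Putting the comments above together, here is a minimal sketch of the two-key case. The data frames are hypothetical (the values are changed from the question's example so that both keys actually match), and `idx` and `by_cols` are names introduced only for illustration. For non-contiguous columns, `idx` could be something like `c(1, 3, 5)`.

```r
library(dplyr)

# Hypothetical data: the key columns have different names in each frame
data1 <- data.frame(a1 = 1:5, b1 = 6:10, x = letters[1:5])
data2 <- data.frame(a2 = 2:6, b2 = 7:11, y = LETTERS[1:5])
data_list <- list(data1, data2)

# Positions of the key columns (assumed to be the same in both frames)
idx <- c(1, 2)

# Named character vector: names come from the left table, values from the right
by_cols <- setNames(colnames(data_list[[2]])[idx],
                    colnames(data_list[[1]])[idx])

# Equivalent to by = c(a1 = "a2", b1 = "b2")
joined <- inner_join(data_list[[1]], data_list[[2]], by = by_cols)
```

The whole mapping goes into a single `by` vector. Passing a second `setNames(...)` as a separate argument would hand it to a different parameter of `inner_join` rather than add a key.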