
I have two data frames, df1 and df2. df1 contains positions and the gene names assigned to those positions, e.g.:

Name    V1 
Gene_1  +1110
Gene_2  +2600
Gene_3  +3600
Gene_4  -2600
Gene_5  -4000
Gene_6  -3500
Gene_7  +2900
.....

df2, on the other hand, contains only the V1 column:

V1
+1110
+3600
-4000
-3500
+2900
....
+6000
-7000
....

What I want to do is look up each 'V1' value of df2 in df1, extract the matching 'Name', and add it to df2 as a new column. Does anyone know how to do this? My output should look like this:

V1     Name
+1110  Gene_1
+3600  Gene_3
-4000  Gene_5
-3500  Gene_6
+2900  Gene_7
....   ......
+6000  Gene_13
-7000  Gene_16
....   ......
margo

2 Answers


Seems like you want a join:

merge(df2,df1, by="V1")

Output:

     V1   Name
1 -4000 Gene_5
2 -3500 Gene_6
3  1110 Gene_1
4  2900 Gene_7
5  3600 Gene_3

Input:

df1 = structure(list(Name = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", 
"Gene_5", "Gene_6", "Gene_7"), V1 = c(1110L, 2600L, 3600L, -2600L, 
-4000L, -3500L, 2900L)), row.names = c(NA, -7L), class = "data.frame")


df2 = structure(list(V1 = c(1110L, 3600L, -4000L, -3500L, 2900L, 6000L, 
-7000L)), row.names = c(NA, -7L), class = "data.frame")
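
Note that merge() defaults to an inner join and sorts the result by the key, so positions in df2 with no match in df1 (6000 and -7000 in this sample data) are dropped. A minimal sketch, assuming you want to keep every row of df2: all.x = TRUE keeps unmatched rows with NA in Name, and match() is a base R alternative that also preserves df2's original row order. The names res_merge and res_match are only illustrative.

```r
# Sample data (same values as the Input above)
df1 <- data.frame(
  Name = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", "Gene_5", "Gene_6", "Gene_7"),
  V1   = c(1110L, 2600L, 3600L, -2600L, -4000L, -3500L, 2900L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(V1 = c(1110L, 3600L, -4000L, -3500L, 2900L, 6000L, -7000L))

# Left join in base R: keep all rows of df2, NA where df1 has no match.
# The result is sorted by V1, not in df2's original order.
res_merge <- merge(df2, df1, by = "V1", all.x = TRUE)

# Lookup with match(): keeps df2's row order.
# Assumes each V1 value occurs at most once in df1 (match() returns the first hit).
res_match <- df2
res_match$Name <- df1$Name[match(res_match$V1, df1$V1)]

print(res_merge)
print(res_match)
```

With the full df1 from the question (which presumably contains the genes at +6000 and -7000), the NA entries would be filled with the corresponding names.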
langtang

We could do a right_join:

right_join() returns all rows from y, and all columns from x and y. Rows in y with no match in x will have NA values in the new columns. If there are multiple matches between x and y, all combinations of the matches are returned:

library(dplyr)

df2 %>% 
  right_join(df1, by="V1")
     V1   Name
1  1110 Gene_1
2  3600 Gene_3
3 -4000 Gene_5
4 -3500 Gene_6
5  2900 Gene_7
6  2600 Gene_2
7 -2600 Gene_4
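
Because right_join() keeps every row of df1 here, the result above includes Gene_2 and Gene_4 and drops the df2 positions 6000 and -7000. If the goal is one row per df2 position, a left_join may be closer to the requested output. A hedged sketch on the same sample data (res is an illustrative name; the unmatched positions come out as NA only because their genes are not in the sample df1):

```r
library(dplyr)

# Sample data (same values as in the other answer)
df1 <- data.frame(
  Name = c("Gene_1", "Gene_2", "Gene_3", "Gene_4", "Gene_5", "Gene_6", "Gene_7"),
  V1   = c(1110L, 2600L, 3600L, -2600L, -4000L, -3500L, 2900L),
  stringsAsFactors = FALSE
)
df2 <- data.frame(V1 = c(1110L, 3600L, -4000L, -3500L, 2900L, 6000L, -7000L))

# left_join() keeps all rows of df2 in their original order
# and adds Name where V1 matches a row in df1 (NA otherwise).
res <- df2 %>%
  left_join(df1, by = "V1")

print(res)
```

Use inner_join() instead if positions without a matching gene should be dropped.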
TarJae