I have two data frames: df_a with two columns, colA and colB, and df_b with one column, colA.

df_a <- data.frame(colA = sample(1:10, 10), colB = sample(LETTERS[1:20],10))

> df_a
   colA colB
1     2    F
2     8    J
3     5    G
4     9    A
5    10    R
6     4    N
7     7    D
8     1    B
9     3    Q
10    6    H

df_b <- data.frame(colA = sample(1:10, 10))
> df_b
   colA
1     9
2     5
3     3
4     7
5     1
6     8
7     2
8     4
9    10
10    6

I need to create a new column colB in df_b by matching colA of df_b against colA of df_a.

> df_b$colB <- df_a[df_a$colA %in% df_b$colA,'colB']

> df_b
   colA colB
1     9    F
2     5    J
3     3    G
4     7    A
5     1    R
6     8    N
7     2    D
8     4    B
9    10    Q
10    6    H

The corresponding values in the two data frames do not match. For example, in df_a the colA value 9 has A in colB, whereas in df_b the colA value 9 has F in colB. Is this because the data frames are not sorted?
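To make the mismatch reproducible, here is a minimal sketch that hard-codes the values printed above instead of calling sample(). It suggests the likely cause: %in% returns one TRUE/FALSE per row of df_a, so the subset comes back in df_a's row order rather than df_b's. The match() lookup is one possible alternative, not necessarily the only one, and the names in_order_of_a and idx are made up for illustration.

```r
# Fixed copies of the data frames printed above (no sampling, so results are stable)
df_a <- data.frame(colA = c(2, 8, 5, 9, 10, 4, 7, 1, 3, 6),
                   colB = c("F", "J", "G", "A", "R", "N", "D", "B", "Q", "H"),
                   stringsAsFactors = FALSE)
df_b <- data.frame(colA = c(9, 5, 3, 7, 1, 8, 2, 4, 10, 6))

# %in% gives one logical value per row of df_a, so this subset keeps df_a's order.
# Here every value is present, so it is simply df_a$colB unchanged.
in_order_of_a <- df_a[df_a$colA %in% df_b$colA, "colB"]

# match() gives, for each value of df_b$colA, its row position in df_a$colA,
# so indexing with it returns colB in df_b's order.
idx <- match(df_b$colA, df_a$colA)
df_b$colB <- df_a$colB[idx]

print(df_b)
```

With this lookup, the row of df_b where colA is 9 gets A, which agrees with df_a. Sorting both data frames by colA would also line the rows up, but only when both contain exactly the same set of keys.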

Note: I couldn't find a similar question, although this may well be a duplicate. I would like to understand the root cause of the error.

The original task was to fill in the NA values in colB of df_b using df_a.

df_a <- data.frame(colA = sample(1:10, 10), colB = sample(LETTERS[1:10],10))
df_b <- data.frame(colA = sample(1:10, 10), colB = sample(c(LETTERS[1:10], 'NA'),10))
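For the NA-filling task, a rough sketch along the same lines might look like the following. It uses small made-up data rather than the sampled frames, and it assumes the missing entries are real NA values; the snippet above puts the string 'NA' into the vector, which is.na() would not detect. The variable name missing is hypothetical.

```r
# Small hypothetical data: df_a is the lookup table, df_b has gaps in colB
df_a <- data.frame(colA = 1:5,
                   colB = c("A", "B", "C", "D", "E"),
                   stringsAsFactors = FALSE)
df_b <- data.frame(colA = c(3, 1, 5, 2, 4),
                   colB = c("C", NA, "E", NA, "D"),
                   stringsAsFactors = FALSE)

# Rows of df_b whose colB is a real NA (assumption: not the string "NA")
missing <- is.na(df_b$colB)

# Look up only the missing rows by key and fill them from df_a
df_b$colB[missing] <- df_a$colB[match(df_b$colA[missing], df_a$colA)]

print(df_b)
```

Keys in df_b that do not occur in df_a would stay NA, because match() returns NA for them.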
Prradep
