I have two datasets that have a different number of cases, but the same number of variables. E.g., this:

test_data <- data.frame(
  var_1 = rep(1, 10),
  index = letters[1:10]
)

other_data <- data.frame(
  var_1 = c(1, 1, 3, 4, 6, 1),
  index = letters[1:6]
)

What I need is to replace the values of var_1 in test_data with the values of var_1 from other_data wherever the index matches, leaving the remaining rows unchanged. The end result would look like this:

> test_data
   var_1 index
1      1     a
2      1     b
3      3     c
4      4     d
5      6     e
6      1     f
7      1     g
8      1     h
9      1     i
10     1     j

I know that dplyr is nice for working with relational data, but I can't figure out whether one of the *_join functions would do this for me, or whether I need something different. Thanks.

Zlo

2 Answers

You can use merge:

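# Left join on index: var_1.x comes from test_data, var_1.y from other_data
merged <- merge(test_data, other_data, by = "index", all.x = TRUE)
# Take the value from other_data where one exists, otherwise keep the original
merged$var <- ifelse(is.na(merged$var_1.y), merged$var_1.x, merged$var_1.y)
merged[, c("var", "index")]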
   var index
1    1     a
2    1     b
3    3     c
4    4     d
5    6     e
6    1     f
7    1     g
8    1     h
9    1     i
10   1     j
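
Since the question asks about dplyr, the same join-then-pick idea can be sketched there as well. This is only an illustration, assuming dplyr is installed; the small data frames and the `_new` suffix are made up for the example and deliberately use values other than 1, so it is visible which rows were updated.

```r
library(dplyr)

# Hypothetical example data with non-placeholder values
test_data <- data.frame(var_1 = c(10, 20, 30, 40), index = letters[1:4])
other_data <- data.frame(var_1 = c(99, 77), index = c("b", "d"))

updated <- test_data %>%
  # Keep every row of test_data; var_1 from other_data arrives as var_1_new
  left_join(other_data, by = "index", suffix = c("", "_new")) %>%
  # Prefer the new value where the join found one, else keep the old value
  mutate(var_1 = coalesce(var_1_new, var_1)) %>%
  select(var_1, index)

updated
```

Unlike merge, left_join keeps the row order of test_data, so no re-sorting is needed afterwards.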
Adam

Just to add another answer, using base R and match:

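# Look up each index of test_data in other_data; rows without a match become NA
test_data$var_1 <- other_data$var_1[match(test_data$index, other_data$index)]
# Fill the unmatched rows with 1, the original value of var_1 in this example
test_data[is.na(test_data)] <- 1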

   var_1 index
1      1     a
2      1     b
3      3     c
4      4     d
5      6     e
6      1     f
7      1     g
8      1     h
9      1     i
10     1     j

This matches the values of index to look up var_1 in other_data, and then replaces the var_1 column of test_data with the result.

NAs are generated because the index column of test_data contains more values (letters) than the index column of other_data. We then replace those NAs with 1.
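
Filling the NAs with 1 only works because every original value of var_1 in this example is 1. If var_1 in test_data holds arbitrary values, one possible variation is to overwrite only the rows that actually have a match and leave the rest untouched. The sketch below uses made-up data, and the names `pos` and `matched` are just for illustration.

```r
# Hypothetical example data with non-placeholder values
test_data <- data.frame(var_1 = c(10, 20, 30, 40), index = letters[1:4])
other_data <- data.frame(var_1 = c(99, 77), index = c("b", "d"))

# Position of each test_data index in other_data (NA where there is no match)
pos <- match(test_data$index, other_data$index)
matched <- !is.na(pos)

# Overwrite only the matched rows; unmatched rows keep their original value
test_data$var_1[matched] <- other_data$var_1[pos[matched]]

test_data
```

Because nothing is ever set to NA, there is no fill step and no need to know a default value in advance.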

Hope it helps.

Cris
  • What if var_1 is not filled with 1, but with different numbers? – Robert Oct 23 '17 at 13:41
  • Yeah, I think I didn't make it clear in my question: the `1`s are a placeholder, there are actually many different numbers. – Zlo Oct 23 '17 at 13:42