
I regularly need to transfer data between dataframes. Often the source dataframe contains only a subset of the rows in the destination dataframe.

Let's say I have this dataframe:

df <- data.frame(ID = c(1,3,6,9), variable = c(-0.1, 0, 0, 0.1))

  ID variable
1  1     -0.1
2  3      0.0
3  6      0.0
4  9      0.1

I need to transfer variable from df to sleep, but only at rows where ID is the same in both df and sleep.

To do this, I normally use a for loop like this:

    sleep$variable <- NA
    # Loop over the rows of df (the source), not the rows of sleep
    for (i in seq_along(df$ID)) {
      x <- which(sleep$ID == df$ID[i])
      sleep$variable[x] <- df$variable[i]
    }

sleep

   extra group ID variable
1    0.7     1  1     -0.1
2   -1.6     1  2       NA
3   -0.2     1  3      0.0
4   -1.2     1  4       NA
5   -0.1     1  5       NA
6    3.4     1  6      0.0
7    3.7     1  7       NA
8    0.8     1  8       NA
9    0.0     1  9      0.1
10   2.0     1 10       NA
11   1.9     2  1     -0.1
12   0.8     2  2       NA
13   1.1     2  3      0.0
14   0.1     2  4       NA
15  -0.1     2  5       NA
16   4.4     2  6      0.0
17   5.5     2  7       NA
18   1.6     2  8       NA
19   4.6     2  9      0.1
20   3.4     2 10       NA

I'm looking for a function that will achieve the same result, but requires less code. Ideally, I would like the function to take just 3 arguments: the vector where the data is coming from, the vector where the data is going to and the vector used to match rows in the two dataframes.

Is there such a function currently available in R? Alternatively, can anyone provide such a function?

luciano
  • Did you look at `merge()`? – Andy Clifton Apr 01 '14 at 18:05
  • There are many different ways to do this in R. This [question/answer](http://stackoverflow.com/questions/4322219/whats-the-fastest-way-to-merge-join-data-frames-in-r) has a great comparison of the various methods. – jlhoward Apr 01 '14 at 21:36

1 Answer

How about `match()`:

sleep <- data.frame(extra = runif(100), group = rep(1:10, each = 10), ID = rep(1:10, times = 10))

sleep$variable <- df$variable[match(sleep$ID, df$ID)]

This takes four vectors rather than the three you asked for, because the ID column is named once for each dataframe (arguably unnecessarily).
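If a three-argument function is preferred, the `match()` lookup can be wrapped in a small helper. The sketch below is only an illustration: `transfer_by_id` and its parameter names are hypothetical, not an existing R function, and it assumes the built-in `sleep` dataset from the question rather than the simulated one above.

```r
# Hypothetical helper (not part of base R): for each destination ID,
# return the value stored under the same ID in the source.
transfer_by_id <- function(to_id, from_id, values) {
  # match() gives the position of the first match of each to_id in from_id,
  # or NA when there is none; indexing values by NA yields NA.
  values[match(to_id, from_id)]
}

df <- data.frame(ID = c(1, 3, 6, 9), variable = c(-0.1, 0, 0, 0.1))

# Assumption: start from the built-in dataset used in the question
sleep <- datasets::sleep
sleep$variable <- transfer_by_id(sleep$ID, df$ID, df$variable)
sleep
```

Rows whose ID does not appear in `df` are left as `NA`, as in the for loop. If `df$ID` contained duplicates, only the first matching row would be used. `merge(sleep, df, by = "ID", all.x = TRUE)` is another way to get the same left join, but by default it sorts the result by `ID`, so the row order can change.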

orizon