0

I have two large data frames, say df1 and df2. I want to add one value, given certain conditions, to df2.

df2 contains pepole - ie:

> df2
  ID Age Gender
1  1  20      M
2  2  30      K
3  3  40      M
4  4  60      M

Then i have df1 which contains lots of infmation, one number for each age from 0 to 100 and for each gender. I need to find the korrekt number for my person. Say i look up age 30 i get

> df1[df1[,'V10'] == 30,]
    V5 V10  V12
31   K  30 5.91
151  M  30 5.33

In df2 i want the value 5.91 since my persen is female. Like this

> df1[df1[,'V10'] == 30 &
+       df1[,'V5'] == 'K', 'V12']
[1] 5.91

First I tried this in a for-loop, but it takes too long. Then i tried with()

df2$number <- with(df2, df1[df1[,'V10'] == Age &
                              df1[,'V5'] == Gender, 'V12'])

but i only gives me one value for all of them.

Hope some of you understand my problem, and can help me. My main problem is that run time is too long.

Sotos
  • 51,121
  • 6
  • 32
  • 66

1 Answers1

0

As it was stated in a comment, you could use merge to get the value of V12 for each gender and age.

I created some dummy data to show how it could be done.

df2 <- data.frame(ID = 1:6,
              Age = c(20,30,40, 40, 60, 60),
              Gender = c("M","K","M", "K", "M", "K"))

df1 <- data.frame(V5 = rep(c("K", "M"), 41),
              V10 = rep(20:60, each= 2),
              V12 = runif(82, 1,10))

df <- merge(df2, df1, by.x=c("Age", "Gender"), by.y=c("V10", "V5"))

With the merge() function, you can merge on more than one column by specifying the column names within the by argument.

by.x its used for the columns of the first data.frame used as input (df2 in this case) and by.y for the column names of df1.

If you prefer, you could use left_join from the dplyr package. You can check this stackoverflow post.

csmontt
  • 614
  • 8
  • 15