I have 2 data frames in R, df1 and df2.

Each row of df1 represents one subject in an experiment. It has 3 columns: the first two specify the combination of groups the subject is in, and the third contains the experimental result.

df2 contains a normalization constant for each combination of groups. It also has three columns: two for the groups and a third for the normalization constant.

Now I want to create a fourth column in df1 that holds the experimental result from the third column divided by the matching normalization constant from df2. How can I do this?

Here's an example:

df1 <- data.frame(c(1,1,1,1),c(1,2,1,2),c(10,11,12,13))
df2 <- data.frame(c(1,1,2,2),c(1,2,1,2),c(30,40,50,60))
names(df1)<-c("Group1","Group2","Result")
names(df2)<-c("Group1","Group2","NormalizationConstant")

As a result, I need a new column in df1 containing c(10/30,11/40,12/30,13/40).

My first attempt is the following code, which fails on my real data with the message "In is.na(e1) | is.na(e2) : Length of the longer object is not a multiple of the length of the shorter object". However, when I replace the comparisons ==df1[,1] and ==df1[,2] with fixed values, it works. Does df1[,1] in this expression really return only the value of the column for the current row?

df1$NormalizedResult<- df1$Result / df2[df2[,1]==df1[,1] & df2[,2]==df1[,2],]$NormalizationConstant

Thanks for your help!

A5C1D2H2I1M1N2O1R2T1
Chris

2 Answers

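If the rows of df1 and df2 are aligned one-to-one, it's as simple as the following. (In the example they are not: rows 3 and 4 of df2 have Group1 == 2, so the last two values below are 12/50 and 13/60 rather than the requested 12/30 and 13/40.)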

> df1$expnormed <- df1$Result/df2$NormalizationConstant
> df1
  Group1 Group2 Result expnormed
1      1      1     10 0.3333333
2      1      2     11 0.2750000
3      1      1     12 0.2400000
4      1      2     13 0.2166667

If they were not exactly aligned you would use merge:

> dfm <-merge(df1,df2)
> dfm
  Group1 Group2 Result NormalizationConstant
1      1      1     10                    30
2      1      1     12                    30
3      1      2     11                    40
4      1      2     13                    40
> dfm$expnormed <- with(dfm, Result/NormalizationConstant)
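
Note that merge() joins on all shared column names by default and returns the rows sorted by key, so dfm is in a different order than df1. A minimal sketch, assuming the original row order matters and that subjects without a matching constant should be kept; the helper column row_id is a name introduced here for illustration only:

```r
df1 <- data.frame(Group1 = c(1, 1, 1, 1),
                  Group2 = c(1, 2, 1, 2),
                  Result = c(10, 11, 12, 13))
df2 <- data.frame(Group1 = c(1, 1, 2, 2),
                  Group2 = c(1, 2, 1, 2),
                  NormalizationConstant = c(30, 40, 50, 60))

# Remember the original position of each subject (hypothetical helper column).
df1$row_id <- seq_len(nrow(df1))

# Join explicitly on the two group columns; all.x = TRUE keeps subjects
# that have no matching constant (they get NA instead of being dropped).
dfm <- merge(df1, df2, by = c("Group1", "Group2"), all.x = TRUE)

# Restore the original order and drop the helper column.
dfm <- dfm[order(dfm$row_id), ]
dfm$row_id <- NULL
rownames(dfm) <- NULL

dfm$expnormed <- dfm$Result / dfm$NormalizationConstant
```

Naming the key columns in by = also guards against merge() silently joining on some other column that happens to share a name in both data frames.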
IRTFM

One possibility:

df1$res <- df1$Result/df2$NormalizationConstant[match(do.call("paste", df1[1:2]), do.call("paste", df2[1:2]))]
  Group1 Group2 Result       res
1      1      1     10 0.3333333
2      1      2     11 0.2750000
3      1      1     12 0.4000000
4      1      2     13 0.3250000
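
This works because match() looks up each key of df1 among the keys of df2 and returns the position of the first hit. The == in the question's attempt, by contrast, compares the columns element by element, row i of df2 against row i of df1, recycling the shorter vector; that is where the length warning comes from when the two data frames have different numbers of rows. A small sketch of the difference on the example data, with aligned, key1, key2 and idx being names made up here for illustration:

```r
df1 <- data.frame(Group1 = c(1, 1, 1, 1),
                  Group2 = c(1, 2, 1, 2),
                  Result = c(10, 11, 12, 13))
df2 <- data.frame(Group1 = c(1, 1, 2, 2),
                  Group2 = c(1, 2, 1, 2),
                  NormalizationConstant = c(30, 40, 50, 60))

# Element-wise comparison: position i of df2 is compared with position i
# of df1, not with every row of df1.
aligned <- df2$Group1 == df1$Group1 & df2$Group2 == df1$Group2
# TRUE TRUE FALSE FALSE

# Key-based lookup: one key per row, then find each df1 key in df2.
key1 <- paste(df1$Group1, df1$Group2)
key2 <- paste(df2$Group1, df2$Group2)
idx <- match(key1, key2)
# 1 2 1 2

df1$res <- df1$Result / df2$NormalizationConstant[idx]
```

On the example data the question's expression happens to select the two constants 30 and 40, which are then recycled against the four results, so it only appears to work there.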

Hth

droopy
  • Nice, but in my real data, the columns Group1 and Group2 are not next to each other. If I replace "df1[1:2]" with "data.frame(df1[1,],df1[3,])", the function returns NA for all values. – Chris Dec 04 '13 at 10:25
  • 1 and 2 are the numbers of the columns of interest. You can use do.call("paste", df1[c(1,3)]) or paste(df1[,1], df1[,3]), for instance; both expressions give the same result. – droopy Dec 04 '13 at 10:39
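
The idea in the last comment can also be written with column names instead of positions, which keeps working if the columns are reordered. A rough sketch; the layout with Result sitting between the two group columns is made up here to mimic the situation described in the first comment, and keys and idx are illustrative names. Note that df1[1, ] with a comma selects a row, whereas df1[c(1,3)] or df1[c("Group1", "Group2")] selects columns:

```r
# Hypothetical layout: the group columns are not adjacent.
df1 <- data.frame(Group1 = c(1, 1, 1, 1),
                  Result = c(10, 11, 12, 13),
                  Group2 = c(1, 2, 1, 2))
df2 <- data.frame(Group1 = c(1, 1, 2, 2),
                  Group2 = c(1, 2, 1, 2),
                  NormalizationConstant = c(30, 40, 50, 60))

# Select the key columns by name; the same names are used for both frames.
keys <- c("Group1", "Group2")

# Build one key string per row, then look up each df1 key in df2.
idx <- match(do.call("paste", df1[keys]), do.call("paste", df2[keys]))

df1$res <- df1$Result / df2$NormalizationConstant[idx]
```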