I'm stuck on the following problem in R and was hoping someone had a quick solution.
I have two sets of data, A and B, where A contains data for a control group and B a case group. I have measures for the same variables for each group.
Within A and B there are subgroups, and in some instances these are paired between A and B: say they are siblings, where one or more can be a case and one or more a control.
The data look something like this:
SET A:
Source    Area group pch pch2 col col2 group2
R1-1   1983447     1   0   16   1    1      1
R1-3   1400362     1   0   16   1    1      1
R3-4   2834393     2   1   16   2    2      1
R4-2   2232820     3   2   16   3    3      1
R4-5   1713796     3   2   16   3    3      1
R4-6   1525740     3   2   16   3    3      1
R4-7   1182300     3   2   16   3    3      1
SET B:
Source    Area group pch pch2 col col2 group2
R1-2   1246124     1   0   16   1    1      2
R3-1   1627610     2   1   16   2    2      2
R3-2   1401600     2   1   16   2    2      2
R4-1   1367146     3   2   16   3    3      2
R4-3   1764125     3   2   16   3    3      2
R4-4   1299864     3   2   16   3    3      2
Source is the individual's ID, Area is the variable of interest, group identifies the sibling group, and the rest are additional variables that are not of interest here.
What I'd like to do is calculate a relative Area for each individual in set B, i.e., their Area divided by the mean Area of their siblings in set A. I'd like this value to appear as a separate column in set B (relArea in the sample below). The output would therefore look like this:
Output (Set B):
Source    Area group     relArea pch pch2 col col2 group2
R1-2   1246124     1 0.736521476   0   16   1    1      2
R3-1   1627610     2 0.574235824   1   16   2    2      2
R3-2   1401600     2 0.494497411   1   16   2    2      2
R4-1   1367146     3 0.821768097   2   16   3    3      2
R4-3   1764125     3  1.06038539   2   16   3    3      2
R4-4   1299864     3 0.781326037   2   16   3    3      2
Finally, if an individual in set B does not have a sibling in set A, then their relArea value would be their Area relative to the average Area of all the controls (i.e., all measurements in set A).
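To make the calculation concrete, here is a minimal base-R sketch of what is being asked for, not a definitive solution. It assumes the two sets are held in data frames named A and B (names chosen here for illustration) and rebuilds only the Source, Area and group columns from the sample above. The helper names ctrl_mean, idx and denom are likewise made up for this sketch.

```r
# Toy copies of the two sets, keeping only the columns used here.
A <- data.frame(
  Source = c("R1-1", "R1-3", "R3-4", "R4-2", "R4-5", "R4-6", "R4-7"),
  Area   = c(1983447, 1400362, 2834393, 2232820, 1713796, 1525740, 1182300),
  group  = c(1, 1, 2, 3, 3, 3, 3)
)
B <- data.frame(
  Source = c("R1-2", "R3-1", "R3-2", "R4-1", "R4-3", "R4-4"),
  Area   = c(1246124, 1627610, 1401600, 1367146, 1764125, 1299864),
  group  = c(1, 2, 2, 3, 3, 3)
)

# Mean control Area per sibling group; the result is named by group label.
ctrl_mean <- tapply(A$Area, A$group, mean)

# Position of each case's group among the control groups (NA if no sibling in A).
idx <- match(as.character(B$group), names(ctrl_mean))
denom <- as.vector(ctrl_mean)[idx]

# Fallback: no sibling in A, so divide by the mean Area of all controls.
denom[is.na(denom)] <- mean(A$Area)

# Relative Area as a new column in B.
B$relArea <- B$Area / denom
print(B)
```

On the sample data this should reproduce the relArea column shown above. The fallback line only takes effect for groups that occur in B but not in A, which the sample data do not contain.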
Any help with this would be much appreciated.
Thanks,
Bjorn