I have two datasets that share a common ID variable and the same n variables, denoted SNP1 to SNPn. An example of the two datasets is shown below.
Dataset 1
ID SNP1 SNP2 SNP3 SNP4 SNP5 SNP6 SNP7
1 0 1 1 0 0 0 0
2 1 1 0 0 0 0 0
3 1 0 0 0 1 1 0
4 0 1 1 0 0 0 0
5 1 0 0 0 1 1 0
6 1 0 0 0 1 1 0
7 0 1 1 0 0 0 0
Dataset 2
ID SNP1 SNP2 SNP3 SNP4 SNP5 SNP6 SNP7
1 0.65 1.3 2.8 0.43 0.62 0.9 1.5
2 0.74 1.6 3.4 0.9 2.4 4.4 2.3
3 0.28 0.5 5.7 6.7 0.3 2.5 0.56
4 0.74 1.6 3.4 0.9 2.4 4.4 2.3
5 0.65 1.3 2.8 0.43 0.62 0.9 1.5
6 0.74 1.6 3.4 0.9 2.4 4.4 2.3
7 0.28 0.5 5.7 6.7 0.3 2.5 0.56
I would like to multiply each value in dataset 1 by the value in the same position in dataset 2.
For example, I would like to multiply position [1,2] in dataset 1 (value = 0) by position [1,2] in dataset 2 (value = 0.65). My real data are very large, spanning almost 300 columns and 500,000 IDs.
The real variable names for SNP1 to SNPn are longer (for example, Affx.5869593), so I cannot just refer to SNP1-SNP300 in my code; the columns would have to be selected by column number.
Do I need to unlist both datasets by person ID and SNP name first? What function can be used for multiplying values within two datasets?
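To make the desired result concrete, here is a minimal sketch in R on a cut-down version of the example data (first three IDs and SNPs only). The object names df1, df2, snp_cols and result are made up for illustration. It assumes that both data frames contain the same IDs and that the SNP columns are in the same order in both; I am not sure this is the best approach for data of my size.

```r
# Toy versions of the two datasets (first three IDs and SNPs only).
df1 <- data.frame(ID = 1:3,
                  SNP1 = c(0, 1, 1),
                  SNP2 = c(1, 1, 0),
                  SNP3 = c(1, 0, 0))
df2 <- data.frame(ID = 1:3,
                  SNP1 = c(0.65, 0.74, 0.28),
                  SNP2 = c(1.3, 1.6, 0.5),
                  SNP3 = c(2.8, 3.4, 5.7))

# Assumption: both data frames hold the same IDs.
# Reorder the rows of df2 so they follow the ID order of df1.
df2 <- df2[match(df1$ID, df2$ID), ]

# Select the SNP columns by position (everything except column 1, the ID),
# so the long Affx.* names never have to be typed out.
snp_cols <- 2:ncol(df1)

# Assumption: the SNP columns appear in the same order in both data frames.
stopifnot(identical(names(df1)[snp_cols], names(df2)[snp_cols]))

# Element-wise product of two equally sized data frames, with the ID re-attached.
result <- cbind(ID = df1$ID, df1[snp_cols] * df2[snp_cols])
```

With the full data, converting the SNP columns to matrices with as.matrix() before multiplying may be faster, but I have not tested that.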