I have been given a table which I need to transform into two other forms. Can anyone help me find a systematic R-way to convert it without a bunch of nested for loops?
Here's a simplified version of the table:
Status <- rep(c(paste0("B",seq(1,4)),"Total"),3)
FID <- c(rep("N123",10),rep("K541",5))
IID <- c(rep(123,5),rep(456,5),rep(789,5))
Value1.G1 <- c(rep(c(888,345,765,875,875,323),2),8039,830,849)
Value2.G1 <- c(rep(c(443,325,761),4),649,975,323)
Value1.G2 <- rep(c(446,345,765,875,323),3)
Value2.G2 <- c(rep(c(540,345,765),4),169,875,431)
dat <-data.frame(FID,IID,Status,Value1.G1,Value2.G1,Value1.G2,Value2.G2)
print(dat)
FID IID Status Value1.G1 Value2.G1 Value1.G2 Value2.G2
1 N123 123 B1 888 443 446 540
2 N123 123 B2 345 325 345 345
3 N123 123 B3 765 761 765 765
4 N123 123 B4 875 443 875 540
5 N123 123 Total 875 325 323 345
6 N123 456 B1 323 761 446 765
7 N123 456 B2 888 443 345 540
8 N123 456 B3 345 325 765 345
9 N123 456 B4 765 761 875 765
10 N123 456 Total 875 443 323 540
11 K541 789 B1 875 325 446 345
12 K541 789 B2 323 761 345 765
13 K541 789 B3 8039 649 765 169
14 K541 789 B4 830 975 875 875
15 K541 789 Total 849 323 323 431
In brief, excluding the first three columns, the number in each cell is a specific value (in this example Value1
or Value2
) for a specific sample (123
,456
, and 789
), on a specific gel (1
and 2
), at a specific status (B1
,B2
,B3
,B4
,and Total
). The first three columns (FID
,IID
,Status
) list information about the sample (FID
,IID
) and the status (Status
).
First Form
The first form I need to create breaks down the titles into their component parts. (Point of clarification: the "Gel" column refers to the previous numeric value after "G" in the column name.)
FID IID Status Value1 Value2 Gel
1 N123 123 B1 888 443 1
2 N123 456 B1 323 761 1
3 K541 789 B1 875 325 1
4 N123 123 B1 446 540 2
5 N123 456 B1 446 765 2
6 K541 789 B1 446 345 2
7 N123 123 B2 345 325 1
8 N123 456 B2 888 443 1
9 K541 789 B2 323 761 1
10 N123 123 B2 345 345 2
11 N123 456 B2 345 540 2
12 K541 789 B2 345 765 2
13 N123 123 B3 765 761 1
14 N123 456 B3 345 325 1
15 K541 789 B3 8039 649 1
16 N123 123 B3 765 765 2
17 N123 456 B3 765 345 2
18 K541 789 B3 765 169 2
19 N123 123 B4 875 443 1
20 N123 456 B4 765 761 1
21 K541 789 B4 830 975 1
22 N123 123 B4 875 540 2
23 N123 456 B4 875 765 2
24 K541 789 B4 875 875 2
25 N123 123 Total 875 325 1
26 N123 456 Total 875 443 1
27 K541 789 Total 849 323 1
28 N123 123 Total 323 345 2
29 N123 456 Total 323 540 2
30 K541 789 Total 323 431 2
Second Form
The second form I need to create requires that there be a single row for each unique individual id (IID
) containing all of it's value information. In this case, the column names indicate all of the information about the the value and it's specific conditions (i.e. which value, which gel and which status). With this example data, that means that there are 3 rows and 22 columns.
FID IID Value1.G1.B1 Value2.G1.B1 Value1.G2.B1 Value2.G2.B1 Value1.G1.B2 Value2.G1.B2 Value1.G2.B2
1 N123 123 888 443 446 540 345 325 345
2 N123 456 323 761 446 765 888 443 345
3 K541 789 875 325 446 345 323 761 345
Value2.G2.B2 Value1.G1.B3 Value2.G1.B3 Value1.G2.B3 Value2.G2.B3 Value1.G1.B4 Value2.G1.B4 Value1.G2.B4
1 345 765 761 765 765 875 443 875
2 540 345 325 765 345 765 761 875
3 765 8039 649 765 169 830 975 875
Value2.G2.B4 Value1.G1.Total Value2.G1.Total Value1.G2.Total Value2.G2.Total
1 540 875 325 323 345
2 765 875 443 323 540
3 875 849 323 323 431