I'm looking for some R-code, which produces same output as a proc summary in SAS can do.
I've looked in this thread, which have somewhat similar issue: R: calculating column sums & row sums as an aggregation from a dataframe, but my problem is slightly different, since I don't want to sum across columns, but rather rows.
My data has the structure, which is shown here:
Flag1 Flag2 Flag3 Type1 Type2 Type3
1 Level1 A FIRST 2 0 0
2 Level1 A SECOND 1 9 0
3 Level1 A THIRD 3 7 0
4 Level1 A FOURTH 9 18 0
5 Level1 A FIFTH 1 22 0
6 Level1 A SIXTH 1 13 0
7 Level1 B FIRST 0 0 0
8 Level1 B SECOND 3 9 0
9 Level1 B THIRD 5 85 0
10 Level1 B FOURTH 4 96 0
11 Level1 B FIFTH 3 40 0
12 Level1 B SIXTH 0 17 0
22 Level2 A FIRST 2 0 0
23 Level2 A SECOND 1 9 0
24 Level2 A THIRD 3 7 0
25 Level2 A FOURTH 9 18 0
26 Level2 A FIFTH 1 22 0
27 Level2 A SIXTH 1 13 0
28 Level2 B FIRST 0 0 0
29 Level2 B SECOND 3 9 0
30 Level2 B THIRD 5 85 0
31 Level2 B FOURTH 4 96 0
32 Level2 B FIFTH 3 40 0
33 Level2 B SIXTH 0 17 0
34 Level3 A FIRST 2 0 0
35 Level3 A SECOND 1 9 0
36 Level3 A THIRD 3 7 0
37 Level3 A FOURTH 9 18 0
38 Level3 A FIFTH 1 22 0
39 Level3 A SIXTH 1 13 0
40 Level3 B FIRST 0 0 0
41 Level3 B SECOND 3 9 0
42 Level3 B THIRD 5 85 0
43 Level3 B FOURTH 4 96 0
44 Level3 B FIFTH 3 40 0
45 Level3 B SIXTH 0 17 0
I would like to have the result look something like this, where the Sum could be sum of type1 for example.
Flag1 Flag2 Flag3 Sum
1 Level1 A FIRST
2 Level1 A SECOND
3 Level1 A THIRD
4 Level1 A FOURTH
5 Level1 A FIFTH
6 Level1 A SIXTH
7 Level1 B FIRST
8 Level1 B SECOND
9 Level1 B THIRD
10 Level1 B FOURTH
11 Level1 B FIFTH
12 Level1 B SIXTH
13 Level1 (all) FIRST
14 Level1 (all) SECOND
15 Level1 (all) THIRD
16 Level1 (all) FOURTH
17 Level1 (all) FIFTH
18 Level1 (all) SIXTH
19 Level1 A (all)
20 Level1 B (all)
21 Level1 (all) (all)
22 Level2 A FIRST
23 Level2 A SECOND
24 Level2 A THIRD
25 Level2 A FOURTH
26 Level2 A FIFTH
27 Level2 A SIXTH
28 Level2 B FIRST
29 Level2 B SECOND
30 Level2 B THIRD
31 Level2 B FOURTH
32 Level2 B FIFTH
33 Level2 B SIXTH
34 Level2 (all) FIRST
35 Level2 (all) SECOND
36 Level2 (all) THIRD
37 Level2 (all) FOURTH
38 Level2 (all) FIFTH
39 Level2 (all) SIXTH
40 Level2 A (all)
41 Level2 B (all)
42 Level2 (all) (all)
43 Level3 A FIRST
44 Level3 A SECOND
45 Level3 A THIRD
46 Level3 A FOURTH
47 Level3 A FIFTH
48 Level3 A SIXTH
49 Level3 B FIRST
50 Level3 B SECOND
51 Level3 B THIRD
52 Level3 B FOURTH
53 Level3 B FIFTH
54 Level3 B SIXTH
55 Level3 (all) FIRST
56 Level3 (all) SECOND
57 Level3 (all) THIRD
58 Level3 (all) FOURTH
59 Level3 (all) FIFTH
60 Level3 (all) SIXTH
61 Level3 A (all)
62 Level3 B (all)
63 Level3 (all) (all)
64 (all) A FIRST
65 (all) A SECOND
66 (all) A THIRD
67 (all) A FOURTH
68 (all) A FIFTH
69 (all) A SIXTH
70 (all) B FIRST
71 (all) B SECOND
72 (all) B THIRD
73 (all) B FOURTH
74 (all) B FIFTH
75 (all) B SIXTH
76 (all) A (all)
77 (all) B (all)
78 (all) (all) (all)
In SAS this is easy using something along the lines of:
PROC SUMMARY data=test;
class Flag1 Flag2 Flag3;
var Type1;
output=final_data Sum=sum(Type1);
run;
This will create all the different subtotals in all levels. (And an "nway" statement would only out the lowest level of the data). I can't seem to find any R-code that is as simple as this? Can you please help me. In my real data I have 7 dimensions, which is why a hardcoded way out of it is not an option.
Please ask elaboration question if the above is unclear.
Best Regards