Apologies if this is a simple issue. I have data in tidy (long) format. I want to see how the set of values in Factor Name differs for each sample in Sample Name, i.e. which factors are missing from which sample. I believe it's possible with the group_by function.
# Groups: Sample Name
`Sample Name` `Factor Name` mean
<fct> <fct> <dbl>
1 S1 ABCD -5.15
2 S1 EFGH 7.74
3 S1 IJKL -7.43
4 S2 ABCD 4.35
5 S2 EFGH -2.15
6 S2 IJKL 2.33
7 S3 ABCD 5.53
8 S3 EFGH 2.84
9 S3 IJKL 1.61
10 S3 MNOP NaN
I've also tried aggregate, and while it gives an output, I would prefer a group_by or pipe-friendly method.
aggregate(`Factor Name` ~ `Sample Name`, df, FUN = function(x) setdiff(unique(df$`Factor Name`), x))
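For reference, the kind of group_by version I have in mind might look like the minimal sketch below. It assumes dplyr is installed and rebuilds df from the printed output above; the names all_factors, missing_by_sample and missing are only illustrative.

```r
library(dplyr)

# Example data rebuilt from the printed tibble (assumed to match the real df)
df <- tibble(
  `Sample Name` = factor(c("S1", "S1", "S1", "S2", "S2", "S2",
                           "S3", "S3", "S3", "S3")),
  `Factor Name` = factor(c("ABCD", "EFGH", "IJKL", "ABCD", "EFGH", "IJKL",
                           "ABCD", "EFGH", "IJKL", "MNOP")),
  mean = c(-5.15, 7.74, -7.43, 4.35, -2.15, 2.33, 5.53, 2.84, 1.61, NaN)
)

# Every factor name that occurs anywhere in the data
all_factors <- levels(df$`Factor Name`)

# For each sample, keep the factor names that do not appear in that sample.
# The result is a list column because the number of missing names varies.
missing_by_sample <- df %>%
  group_by(`Sample Name`) %>%
  summarise(missing = list(setdiff(all_factors, as.character(`Factor Name`))))

print(missing_by_sample$missing)
```

With this data, S1 and S2 should each report MNOP as missing, and S3 should report nothing.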
Also, if possible, I would like to add the missing Factor Name for each Sample Name, like so:
# Groups: Sample Name
`Sample Name` `Factor Name` mean
<fct> <fct> <dbl>
1 S1 ABCD -5.15
2 S1 EFGH 7.74
3 S1 IJKL -7.43
4 S1 MNOP NaN
5 S2 ABCD 4.35
6 S2 EFGH -2.15
7 S2 IJKL 2.33
8 S2 MNOP NaN
9 S3 ABCD 5.53
10 S3 EFGH 2.84
11 S3 IJKL 1.61
12 S3 MNOP NaN
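One possible way to produce this filled-in table is tidyr's complete(), sketched below under the assumption that dplyr and tidyr are installed and that df matches the first printed tibble. The name completed is only illustrative. Since the printed data is grouped by Sample Name, the sketch calls ungroup() first so that the combinations are expanded across all samples rather than within each group.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the first printed tibble (assumed to match the real df)
df <- tibble(
  `Sample Name` = factor(c("S1", "S1", "S1", "S2", "S2", "S2",
                           "S3", "S3", "S3", "S3")),
  `Factor Name` = factor(c("ABCD", "EFGH", "IJKL", "ABCD", "EFGH", "IJKL",
                           "ABCD", "EFGH", "IJKL", "MNOP")),
  mean = c(-5.15, 7.74, -7.43, 4.35, -2.15, 2.33, 5.53, 2.84, 1.61, NaN)
)

# Add one row for every Sample Name / Factor Name combination that is absent.
# ungroup() is a no-op here, but matters if df is still grouped.
completed <- df %>%
  ungroup() %>%
  complete(`Sample Name`, `Factor Name`)

print(completed)
```

The added rows get NA in mean by default rather than NaN; the fill argument of complete() can supply a different value if that distinction matters.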