[Still new to this, let me know if I need to provide any more information to help with my problem]
Trying to help a labmate with some data analysis, but the column names in the data compound several variables.
In his data, each row represents a mean value. each column is named mean.x.y.z, where x, y, & z, refer to different values of different variables. What we need to do is split each column name into its composite parts, and create separate columns for each of those.
for a more concrete example, if this is the data we were provided:
messy <- tibble(
id = c('P1', 'P2', 'P3'),
value.neutral.up = c(1.2, 1.3, 1.2),
value.neutral.down = c(2.1, 3.1, 2.1),
value.valid.up = c(1.2, 1.3, 1.2),
value.valid.down = c(2.1, 3.1, 2.1),
value.invalid.up = c(1.2, 1.3, 1.2),
value.invalid.down = c(2.1, 3.1, 2.1)
)
messy
# A tibble: 3 x 7
id value.neutral.up value.neutral.down value.valid.up value.valid.down value.invalid.up value.invalid.down
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 P1 1.2 2.1 1.2 2.1 1.2 2.1
2 P2 1.3 3.1 1.3 3.1 1.3 3.1
3 P3 1.2 2.1 1.2 2.1 1.2 2.1
Value refers to the numeric value that we want to extract, neutral/valid/invalid & up/down refer to levels of two predictor variables.
With that in mind, our end goal is something that looks like:
tidy <- tibble(
id = c( rep('P1', 6), rep('P2', 6), rep('P3', 6)),
cue = rep(c('neutral', 'valid', 'invalid'),6),
direction = rep(c('up', 'down'), 9),
value = rep(c(1.2, 1.3, 1.4, 2.1, 3.1, 4.1), 3)
)
tidy
# A tibble: 18 x 4
id cue direction value
<chr> <chr> <chr> <dbl>
1 P1 neutral up 1.2
2 P1 valid down 1.3
3 P1 invalid up 1.4
4 P1 neutral down 2.1
5 P1 valid up 3.1
6 P1 invalid down 4.1
7 P2 neutral up 1.2
8 P2 valid down 1.3
9 P2 invalid up 1.4
10 P2 neutral down 2.1
11 P2 valid up 3.1
12 P2 invalid down 4.1
13 P3 neutral up 1.2
14 P3 valid down 1.3
15 P3 invalid up 1.4
16 P3 neutral down 2.1
17 P3 valid up 3.1
18 P3 invalid down 4.1
I've used reshape2::gather() for this kind of problem before, but only in cases where column names didn't compound more than one variable name. That is, if the columns were value.neutral, value.valid, etc, that would be fine, but I don't know how to deal with the additional .up/down bit.
Any recommendations for how to handle this? Even if somebody could help me to better phrase this problem (in searchable terms) I'd be appreciative.
Thanks folks!