How to move from wide to long format with groups of measurement variables in one case (not single variables)?

Question

I have a dataset in a wide format: each case corresponds to one person. Each person has a number of measurements, and these measurements belong together: There are nine time points and different things are measured at each of these time points.

Let me illustrate the structure of my dataset (called "data"):

PID     T1_M1     T1_M2      T2_M1      T2_M2
1       VW        green      Audi       red
2       Audi      red        Mercedes   yellow
3       VW        yellow     Ferrari    blue
4       Tesla     red        Fiat       red
5       Cadillac  blue       Tesla      yellow
6       Toyota    white      Opel       blue
7       Opel      brown      VW         white
8       Citroen   orange     Audi       green
9       Peugeot   green      Citroen    orange

Now I would like to bring the dataset into a "long" format. However, with conventional functions such as pivot_longer (from the tidyr package), only a number of columns are given and for each observation in each of these columns, a new case is created.

The R line ...

data_long<- pivot_longer(data, starts_with("T"), names_to = "variable", values_to = "values")

results in the following type of dataframe: enter image description here

Each measurement is one case - but this is not what I want. I would like to have this structure:

PID     M1        M2      T
1       VW        green   1   
1       Audi      red     2
2       Audi      red     1
2       Mercedes  yellow  2
3       VW        yellow  1
3       Ferrari   blue    2
4       Tesla     red     1
4       Fiat      red     2
5       Cadillac  blue    1
5       Tesla     yellow  2
6       Toyota    white   1
6       Opel      blue    2
7       Opel      brown   1
7       VW        white   2
8       Citroen   orange  1
8       Audi      green   2
9       Peugeot   green   1
9       Citroen   orange  2

Both linked threads deal with cases in which one single value should added to a case. I want multiple values of multiple variables in one case (but not all, compared to the wide version of a data_set)

However, I would like a case to be created only for a group of connected measurements at a point in time, not for each individual measurement in each of the variables.

What would be a good solution for this? I'm a bit lost.

Please reopen this question! The linked posts do not show, how to deal with sets of measurement values to be combined in one case (not all values (= wide format), but also not only one value). — matclou, Mar 03 '23 at 13:29
@matclou even if not a duplicate, your question doesn’t include the information we would need to help you effectively — ie, a [mre] including example data, example desired output, and any code you’ve tried already. Please have a look at [How to make a great R reproducible example](https://stackoverflow.com/q/5963269/17303805). — zephryl, Mar 03 '23 at 13:50
You may want to make use of the `".value"` sentinel and `names_sep` / `names_pattern` args of `pivot_longer()`, but hard to say more without a reprex. I suggest reading [the docs](https://tidyr.tidyverse.org/reference/pivot_longer.html) for `pivot_longer()`, especially the examples at the end. — zephryl, Mar 03 '23 at 13:54
@zephryl: I will try to add an example here to make it more illustrative, potentially tommorrow. — matclou, Mar 03 '23 at 17:25
Hi! @zephry l added further examples that to show, what I mean exactly. I think it is distinctive from the other topics - so maybe someone could reopen the thread ;-) — matclou, Mar 09 '23 at 09:44
Thanks for adding. If you adapt the code from the second linked duplicate, you get essentially what you want: `pivot_longer(data, starts_with("T"), names_to = c("T", ".value"), names_sep = "_")`. Or you can use `names_pattern` and `names_transform` to make the `"T"` column numeric: `pivot_longer(data, starts_with("T"), names_to = c("T", ".value"), names_pattern = "T(\\d)_(.+)", names_transform = type.convert)`. — zephryl, Mar 09 '23 at 10:47
Thank you very much for your help! This really helped and was the thing I was looking for :-) (even though the question did not reopen ;-) ) — matclou, Mar 17 '23 at 18:25

How to move from wide to long format with groups of measurement variables in one case (not single variables)?

0 Answers0