r reshape/pivot_wide part of dataframe in long form

Question

I have a dataset which looks like this:

## # A tibble: 15 × 7
##    participant.code round Index name           secs_since_round_s…   delta  data
##    &lt;fct&gt;            &lt;int&gt; &lt;int&gt; &lt;chr&gt;                        &lt;dbl&gt;   &lt;dbl&gt; &lt;dbl&gt;
##  1 0ey3b4tb             1    20 &quot;&quot;                              NA  -3.31  170. 
##  2 0ey3b4tb             1    21 &quot;&quot;                              NA  -0.127 170. 
##  3 0ey3b4tb             1    22 &quot;&quot;                              NA   6.72  177. 
##  4 0ey3b4tb             1    23 &quot;&quot;                              NA  13.1   190. 
##  5 0ey3b4tb             1    24 &quot;Trade_ends&quot;                   126   3.90  194. 
##  6 0ey3b4tb             1    25 &quot;&quot;                              NA   4.98  199. 
##  7 0ey3b4tb             2    26 &quot;Trade_starts&quot;                   0   9.18  109. 
##  8 0ey3b4tb             2    27 &quot;slider value…                   9   2.91  112. 
##  9 0ey3b4tb             2    27 &quot;show confirm…                   9   2.91  112. 
## 10 0ey3b4tb             2    27 &quot;Continue kee…                  11   2.91  112. 
## 11 0ey3b4tb             2    27 &quot;slider value…                  13   2.91  112. 
## 12 0ey3b4tb             2    28 &quot;slider value…                  16   3.41  115. 
## 13 0ey3b4tb             2    28 &quot;slider value…                  18   3.41  115. 
## 14 0ey3b4tb             2    29 &quot;sharpChangeD…                  26  11.3   127. 
## 15 0ey3b4tb             2    30 &quot;&quot;                              NA -78.9    47.9

This is a panel data df in long format, grouped by participant.code, round and Index of events within each round, but it has complicated structure: some events took place at the same time within rounds, so they have the same Index but different values associated with these events. Events are recorded in name field, e.g. four events took place at Index value 27, and two at Index value 28. Values corresponding to these events are recorded in several separate columns, such as secs_since_round_s... and others (not in the tibble). I want to keep this long format structure, but

keep the last of all repeated observations with the original Index within each group by participant.code and round,
reshape all previous observations with the same Index to wide format in the same row as this last observation, with variable name taken from the corresponding name and variable name, and value - from the relevant value in that variable. E.g., interaction of Continue kee..._secs_since_round_s. should have value 11 in line 11, and slider value..._secs_since_round_s.. should have value 16 in line 13, so that the resulting df looks like this:

## # A tibble: 15 × 7
##    participant.code round Index name           secs_since_round_s…   delta  data    slider value…    show confirm…       Continue kee…_secs_since_round_s…   slider_value…_secs_since_round_s… 
##    <fct>            <int> <int> <chr>      <dbl>             <dbl>    <dbl>   <dbl>          <dbl>                 <dbl>                           <dbl> 
##  1 0ey3b4tb             1    20 ""                              NA  -3.31  170. 
##  2 0ey3b4tb             1    21 ""                              NA  -0.127 170. 
##  3 0ey3b4tb             1    22 ""                              NA   6.72  177. 
##  4 0ey3b4tb             1    23 ""                              NA  13.1   190. 
##  5 0ey3b4tb             1    24 "Trade_ends"                   126   3.90  194. 
##  6 0ey3b4tb             1    25 ""                              NA   4.98  199. 
##  7 0ey3b4tb             2    26 "Trade_starts"                   0   9.18  109. 
## 11 0ey3b4tb             2    27 "slider value…                  13   2.91  112.  some_value_here some_value_here             11
## 13 0ey3b4tb             2    28 "slider value…                  18   3.41  115.  some_value_here                                     16
## 14 0ey3b4tb             2    29 "sharpChangeD…                  26  11.3   127. 
## 15 0ey3b4tb             2    30 ""                              NA -78.9    47.9

This looks like a task for pivot_wide, but the data structure is rather complex. Any suggestions how to do that?

EDIT Apparently this problem is not trivial at all to handle using pre-built libraries, so eventually I used a 'brute force' solution, and reshaped my dataset using an ad hoc if-for loop. Generally, this is not an r'ish way to proceed - but conventional techniques have their natural limitations, and it looks like the problem in question is an instance of that class.

Could you provide a helpful way to recreate your data ourselves? https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Harrison Jones, Nov 17 '21 at 16:41
Sure - here is the sample data file: https://github.com/albelix/datasample — Alexis Belianin, Nov 17 '21 at 17:32

r reshape/pivot_wide part of dataframe in long form

0 Answers0