Stacking data file in R

Question

I am looking to stack a datafile that I have in R and am struggling to find a solution to my problem. I have 49 variables, named i1.1, i1.2, 1.3, i1.4, i1.5, i1.6, i1.7 ... i7.6, i7.7. And I would like to stack all of the i1s together with a new variable specifying the last part of the variable name that it was taken from e.g if the value was taken from i1.3 the new variable would contain the value 3. The other thing I have in my datset is that I wish to also keep an id variable in the dataset so that I am able to merge some additional data in at a later stage. DOes anyone have a code or an idea, how I could create something to do this? I am really stuck. Any suggestions welcome. Thank you in advance.

    > #Call datafile
> testfile <-read_xlsx("C:/Users/rawlingsD/Documents/R Pre Analysis/Test stacking file.xlsx")
> head(testfile)
# A tibble: 3 x 50
  Respondent_ID  i1.1  i1.2  i1.3  i1.4  i1.5  i1.6  i1.7  i2.1  i2.2  i2.3  i2.4  i2.5  i2.6  i2.7  i3.1
          <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1           12.    3.    1.    7.    1.    7.    1.    5.    1.    7.    1.    1.    1.    9.    1.    8.
2           13.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.
3           14.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.
# ... with 34 more variables: i3.2 <dbl>, i3.3 <dbl>, i3.4 <dbl>, i3.5 <dbl>, i3.6 <dbl>, i3.7 <dbl>,
#   i4.1 <dbl>, i4.2 <dbl>, i4.3 <dbl>, i4.4 <dbl>, i4.5 <dbl>, i4.6 <dbl>, i4.7 <dbl>, i5.1 <dbl>,
#   i5.2 <dbl>, i5.3 <dbl>, i5.4 <dbl>, i5.5 <dbl>, i5.6 <dbl>, i5.7 <dbl>, i6.1 <dbl>, i6.2 <dbl>,
#   i6.3 <dbl>, i6.4 <dbl>, i6.5 <dbl>, i6.6 <dbl>, i6.7 <dbl>, i7.1 <dbl>, i7.2 <dbl>, i7.3 <dbl>,
#   i7.4 <dbl>, i7.5 <dbl>, i7.6 <dbl>, i7.7 <dbl>

ANd this is the dataset I would like at the end with i2, i3 etc next to i1:

    > head(testfile_complete)
# A tibble: 6 x 3
  Respondent_ID image    i1
          <dbl> <dbl> <dbl>
1           12.    1.    3.
2           13.    1.    2.
3           14.    1.    3.
4           12.    2.    1.
5           13.    2.    2.
6           14.    2.    3.

Possible duplicate: https://stackoverflow.com/questions/19074159/r-use-rbind-on-multiple-variables-with-similar-names — MrFlick, May 03 '18 at 19:49
But the problem is really creating all those separate variables in the first place. It would have been much easier if you had put related items into a list. — MrFlick, May 03 '18 at 19:50
This would be much easier to answer if you had a some sample data as well as an example of what you expect you final data to look like. In mean time you could look at the `tidyr` package — GordonShumway, May 03 '18 at 19:53
Maybe look at answers to this similar-sounding problem https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format (Without a concrete example, you're bound to get people interpreting the problem different ways.) — Frank, May 03 '18 at 20:01
@GordonShumway, thanks for the feedback. Please find the amended datasets :) Does this make more sense? — Daniel Rawlings, May 03 '18 at 20:30
@Frank, I effectively want the opposite of the question you suggested :) Does this make sense? — Daniel Rawlings, May 03 '18 at 20:31
Hm, it does look like `data.table::melt(DF, id = "Respondent_ID", meas = patterns("i1", "i2", "i3"), value.name = c("i1", "i2", "i3"), variable.name = "image")` or something similar. I haven't tested this code since while you now have a concrete example, it's not easily reproducible eg by copy-pasting some code (https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250) Sorry about the escalating edit requests. To clarify, you don't need to provide your actual data as long as the example illustrates the problem. — Frank, May 03 '18 at 20:36
Another link re `melt` with `patterns` (possibly a dupe): https://stackoverflow.com/q/12466493/ — Frank, May 03 '18 at 20:39

Onyambu · Accepted Answer · 2018-05-04T15:11:35.417

1

If you need the columns, then you do: I will only do for i1 and i2

reshape(dat1,varying = t(matrix(names(dat1[-1]),7)),idvar = "Respondent_ID",dir="long")
     Respondent_ID time i1.1 i2.1
12.1            12    1    3    1
13.1            13    1    2    2
14.1            14    1    3    3
12.2            12    2    1    7
13.2            13    2    2    2
14.2            14    2    3    3
12.3            12    3    7    1
13.3            13    3    2    2
14.3            14    3    3    3
12.4            12    4    1    1
13.4            13    4    2    2
14.4            14    4    3    3
12.5            12    5    7    1
13.5            13    5    2    2
14.5            14    5    3    3
12.6            12    6    1    9
13.6            13    6    2    2
14.6            14    6    3    3
12.7            12    7    5    1
13.7            13    7    2    2
14.7            14    7    3    3

edited May 04 '18 at 15:11

answered May 03 '18 at 20:57

Onyambu

67,392
3
24
53

Thank you!!! This is great, but how do I seperate the i2, i3, i4 out seperately as additional stacked variables? :) – Daniel Rawlings May 04 '18 at 06:28
That is not what the question asked. Give me a moment – Onyambu May 04 '18 at 06:34
Apologies, this is what I meant by this comment, "ANd this is the dataset I would like at the end with i2, i3 etc next to i1:" – Daniel Rawlings May 04 '18 at 15:07
Is the results not what you want? The last results? – Onyambu May 04 '18 at 15:09

Stacking data file in R

1 Answers1