0

I am looking to stack a datafile that I have in R and am struggling to find a solution to my problem. I have 49 variables, named i1.1, i1.2, 1.3, i1.4, i1.5, i1.6, i1.7 ... i7.6, i7.7. And I would like to stack all of the i1s together with a new variable specifying the last part of the variable name that it was taken from e.g if the value was taken from i1.3 the new variable would contain the value 3. The other thing I have in my datset is that I wish to also keep an id variable in the dataset so that I am able to merge some additional data in at a later stage. DOes anyone have a code or an idea, how I could create something to do this? I am really stuck. Any suggestions welcome. Thank you in advance.

    > #Call datafile
> testfile <-read_xlsx("C:/Users/rawlingsD/Documents/R Pre Analysis/Test stacking file.xlsx")
> head(testfile)
# A tibble: 3 x 50
  Respondent_ID  i1.1  i1.2  i1.3  i1.4  i1.5  i1.6  i1.7  i2.1  i2.2  i2.3  i2.4  i2.5  i2.6  i2.7  i3.1
          <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1           12.    3.    1.    7.    1.    7.    1.    5.    1.    7.    1.    1.    1.    9.    1.    8.
2           13.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.    2.
3           14.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.    3.
# ... with 34 more variables: i3.2 <dbl>, i3.3 <dbl>, i3.4 <dbl>, i3.5 <dbl>, i3.6 <dbl>, i3.7 <dbl>,
#   i4.1 <dbl>, i4.2 <dbl>, i4.3 <dbl>, i4.4 <dbl>, i4.5 <dbl>, i4.6 <dbl>, i4.7 <dbl>, i5.1 <dbl>,
#   i5.2 <dbl>, i5.3 <dbl>, i5.4 <dbl>, i5.5 <dbl>, i5.6 <dbl>, i5.7 <dbl>, i6.1 <dbl>, i6.2 <dbl>,
#   i6.3 <dbl>, i6.4 <dbl>, i6.5 <dbl>, i6.6 <dbl>, i6.7 <dbl>, i7.1 <dbl>, i7.2 <dbl>, i7.3 <dbl>,
#   i7.4 <dbl>, i7.5 <dbl>, i7.6 <dbl>, i7.7 <dbl>

ANd this is the dataset I would like at the end with i2, i3 etc next to i1:

    > head(testfile_complete)
# A tibble: 6 x 3
  Respondent_ID image    i1
          <dbl> <dbl> <dbl>
1           12.    1.    3.
2           13.    1.    2.
3           14.    1.    3.
4           12.    2.    1.
5           13.    2.    2.
6           14.    2.    3.
Daniel Rawlings
  • 183
  • 1
  • 1
  • 8
  • Possible duplicate: https://stackoverflow.com/questions/19074159/r-use-rbind-on-multiple-variables-with-similar-names – MrFlick May 03 '18 at 19:49
  • But the problem is really creating all those separate variables in the first place. It would have been much easier if you had put related items into a list. – MrFlick May 03 '18 at 19:50
  • This would be much easier to answer if you had a some sample data as well as an example of what you expect you final data to look like. In mean time you could look at the `tidyr` package – GordonShumway May 03 '18 at 19:53
  • Maybe look at answers to this similar-sounding problem https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format (Without a concrete example, you're bound to get people interpreting the problem different ways.) – Frank May 03 '18 at 20:01
  • @GordonShumway, thanks for the feedback. Please find the amended datasets :) Does this make more sense? – Daniel Rawlings May 03 '18 at 20:30
  • @Frank, I effectively want the opposite of the question you suggested :) Does this make sense? – Daniel Rawlings May 03 '18 at 20:31
  • Hm, it does look like `data.table::melt(DF, id = "Respondent_ID", meas = patterns("i1", "i2", "i3"), value.name = c("i1", "i2", "i3"), variable.name = "image")` or something similar. I haven't tested this code since while you now have a concrete example, it's not easily reproducible eg by copy-pasting some code (https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250) Sorry about the escalating edit requests. To clarify, you don't need to provide your actual data as long as the example illustrates the problem. – Frank May 03 '18 at 20:36
  • Another link re `melt` with `patterns` (possibly a dupe): https://stackoverflow.com/q/12466493/ – Frank May 03 '18 at 20:39

1 Answers1

1

If you need the columns, then you do: I will only do for i1 and i2

reshape(dat1,varying = t(matrix(names(dat1[-1]),7)),idvar = "Respondent_ID",dir="long")
     Respondent_ID time i1.1 i2.1
12.1            12    1    3    1
13.1            13    1    2    2
14.1            14    1    3    3
12.2            12    2    1    7
13.2            13    2    2    2
14.2            14    2    3    3
12.3            12    3    7    1
13.3            13    3    2    2
14.3            14    3    3    3
12.4            12    4    1    1
13.4            13    4    2    2
14.4            14    4    3    3
12.5            12    5    7    1
13.5            13    5    2    2
14.5            14    5    3    3
12.6            12    6    1    9
13.6            13    6    2    2
14.6            14    6    3    3
12.7            12    7    5    1
13.7            13    7    2    2
14.7            14    7    3    3
Onyambu
  • 67,392
  • 3
  • 24
  • 53