wide to long and back to wide again

Question

I am having a little difficulty getting my data back to its "original" format.

The data is the following:

       stocks      bills      bonds         gold
1   0.4496607  0.0423607  0.0199607  0.011560700
2  -0.0888480  0.0257520  0.0361520 -0.005848000
3  -0.1872465  0.1094535  0.1093535  0.063953500
4  -0.3452323  0.1162677  0.0675677  0.093167700
5   0.0163397  0.1134397  0.1906397  0.102739700
6   0.4921664  0.0019664  0.0109664  0.439369260
7  -0.0270515 -0.0119515  0.0644485  0.064510328
8   0.4375493 -0.0280507  0.0148493 -0.029850700
9   0.3049072 -0.0127928  0.0357072 -0.014492800
10 -0.3819714 -0.0255714 -0.0147714 -0.028571400
11  0.3205778  0.0285778  0.0698778  0.027777800
12 -0.0110000  0.0004000  0.0441000  0.000000000
13 -0.1138429 -0.0068429  0.0468571 -0.021531637
14 -0.2269908 -0.0984908 -0.1194908 -0.070717428
15  0.1013774 -0.0869226 -0.0674226 -0.090322600
16  0.2210142 -0.0257858 -0.0046858 -0.001806236
17  0.1673115 -0.0191885  0.0028115 -0.029861379
18  0.3357281 -0.0186719  0.0155281  0.004740664
19 -0.2656187 -0.1775187 -0.1500187 -0.154827085
20 -0.0363721 -0.0826721 -0.0791721  0.028684455

Code:

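library(tidyr)  # provides gather(), spread() and re-exports %>%

# Reshape wide to long: one row per (asset, return) pair
d <- df %>%
  gather(assets, rets, stocks:gold, factor_key = TRUE)

# Attempt to reshape back to wide; this call raises the error below
d %>%
  spread(assets, rets)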

Error: Duplicate identifiers for rows (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20), (21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40), (41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60), (61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80)

How can I put the data back into the same format as I originally had it?

Data

structure(list(stocks = c(0.4496607, -0.088848, -0.1872465, -0.3452323, 
0.0163397, 0.4921664, -0.0270515, 0.4375493, 0.3049072, -0.3819714, 
0.3205778, -0.011, -0.1138429, -0.2269908, 0.1013774, 0.2210142, 
0.1673115, 0.3357281, -0.2656187, -0.0363721), bills = c(0.0423607, 
0.025752, 0.1094535, 0.1162677, 0.1134397, 0.0019664, -0.0119515, 
-0.0280507, -0.0127928, -0.0255714, 0.0285778, 0.0004, -0.0068429, 
-0.0984908, -0.0869226, -0.0257858, -0.0191885, -0.0186719, -0.1775187, 
-0.0826721), bonds = c(0.0199607, 0.036152, 0.1093535, 0.0675677, 
0.1906397, 0.0109664, 0.0644485, 0.0148493, 0.0357072, -0.0147714, 
0.0698778, 0.0441, 0.0468571, -0.1194908, -0.0674226, -0.0046858, 
0.0028115, 0.0155281, -0.1500187, -0.0791721), gold = c(0.0115607, 
-0.005848, 0.0639535, 0.0931677, 0.1027397, 0.43936926, 0.064510328, 
-0.0298507, -0.0144928, -0.0285714, 0.0277778, 0, -0.021531637, 
-0.070717428, -0.0903226, -0.001806236, -0.029861379, 0.004740664, 
-0.154827085, 0.028684455)), row.names = c(NA, 20L), class = "data.frame")

Thanks, I am taking a look at it. Applying `d2 <- d %>% dcast(rets ~ assets)` gives some weird results. — user8959427, Jun 28 '19 at 15:23
This answer is probably what you want: https://stackoverflow.com/a/44511254/5150629 Before `spread`ing your dataset, you need to create an id variable. — acylam, Jun 28 '19 at 15:26
Agree with needing to add an id variable. Think about how your code is working. How does `spread()` know which values go in which rows? When you `gather()` you don't have any indicator left to tell the software how to line things back up. — Adam Sampson, Jun 28 '19 at 15:43
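
A minimal sketch of the id-variable approach suggested in the comments, assuming the tidyr and dplyr packages are installed. The column name `id` is a hypothetical choice made for this example, and the small `df` below is a three-row stand-in for the question's data so that the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Stand-in for the question's `df`: first three rows only.
df <- data.frame(
  stocks = c(0.4496607, -0.0888480, -0.1872465),
  bills  = c(0.0423607,  0.0257520,  0.1094535),
  bonds  = c(0.0199607,  0.0361520,  0.1093535),
  gold   = c(0.0115607, -0.0058480,  0.0639535)
)

# Record each row's original position before reshaping.
# `id` is a hypothetical column name, not part of the original data.
long <- df %>%
  mutate(id = row_number()) %>%
  gather(assets, rets, stocks:gold, factor_key = TRUE)

# With `id` present, spread() can tell which values belong to the same row.
# The helper column is dropped afterwards to match the original layout.
wide <- long %>%
  spread(assets, rets) %>%
  select(-id)
```

Because `factor_key = TRUE` stores `assets` as a factor in the original column order, the columns should come back in that same order. In tidyr 1.0.0 and later, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`, which need the same kind of id column to line the rows back up.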
