how can I make a new data frame where the columns are the unique values with corresponding observations from an old data frame?

Question

My data frame has different dates as rows. Every unique date occurs appr. 500 times. I want to make a new data frame where every column is a unique date and where the rows are all the observations of that date from my old dataset. So for every column dat represents a certain date, I should have appr. 500 rows that each represent a rel_spread from that day.

enter image description here

Do you have the exact same number of entries for each date? – Allan Cameron Mar 19 '22 at 09:38 — Allan Cameron, Mar 19 '22 at 09:38

score 1 · Answer 1 · answered Mar 19 '22 at 09:51

You can use pivot_wider from tidyr:

library(tidyr)

pivot_wider(df, names_from = date, values_from = rel_spread, values_fn = list) %>%
  unnest(everything())
#> # A tibble: 2 x 17
#>   `20000103` `20000104` `20000105` `20000106` `20000107` `20000108` `20000109`
#>        <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
#> 1    -0.0234    -0.0128    0.00729     0.0408    -0.0298     0.0398     0.0445
#> 2     0.0492    -0.0120    0.0277      0.0435    -0.0288     0.0152    -0.0374
#> # ... with 10 more variables: `20000110` <dbl>, `20000111` <dbl>,
#> #   `20000112` <dbl>, `20000113` <dbl>, `20000114` <dbl>, `20000115` <dbl>,
#> #   `20000116` <dbl>, `20000117` <dbl>, `20000118` <dbl>, `20000119` <dbl>

Note that we don't have your data (and I wasn't about to transcribe a picture of your data), but I created a little reproducible data set which should match the structure of your data set, except it only has two values per date for demo purposes:

set.seed(1)
df <- data.frame(date = rep(as.character(20000103:20000119), 2),
                 rel_spread = runif(34, -0.05, 0.05))

df
#>        date    rel_spread
#> 1  20000103 -0.0234491337
#> 2  20000104 -0.0127876100
#> 3  20000105  0.0072853363
#> 4  20000106  0.0408207790
#> 5  20000107 -0.0298318069
#> 6  20000108  0.0398389685
#> 7  20000109  0.0444675269
#> 8  20000110  0.0160797792
#> 9  20000111  0.0129114044
#> 10 20000112 -0.0438213730
#> 11 20000113 -0.0294025425
#> 12 20000114 -0.0323443247
#> 13 20000115  0.0187022847
#> 14 20000116 -0.0115896282
#> 15 20000117  0.0269841420
#> 16 20000118 -0.0002300758
#> 17 20000119  0.0217618508
#> 18 20000103  0.0491906095
#> 19 20000104 -0.0119964821
#> 20 20000105  0.0277445221
#> 21 20000106  0.0434705231
#> 22 20000107 -0.0287857479
#> 23 20000108  0.0151673766
#> 24 20000109 -0.0374444904
#> 25 20000110 -0.0232779331
#> 26 20000111 -0.0113885907
#> 27 20000112 -0.0486609667
#> 28 20000113 -0.0117612043
#> 29 20000114  0.0369690846
#> 30 20000115 -0.0159651003
#> 31 20000116 -0.0017919885
#> 32 20000117  0.0099565825
#> 33 20000118 -0.0006458693
#> 34 20000119 -0.0313782399

thank you so much for the help! Together with Zephryl's comment I got the result I needed — ameliedc, Mar 19 '22 at 11:20

score 1 · Answer 2 · answered Mar 19 '22 at 10:10

Allan’s answer is perfect if you have the same number of rows for each date. If this isn’t the case, the following should work:

library(tidyr)
library(dplyr)

data_wide <- data_long %>%
  group_by(date) %>%
  mutate(daterow = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = date, values_from = rel_spread) %>%
  select(!daterow)
  
data_wide

Output:

# A tibble: 6 x 4
  `20000103` `20000104` `20000105` `20000106`
       <dbl>      <dbl>      <dbl>      <dbl>
1     -0.626      0.184     -0.836    -0.621 
2      1.60       0.330     -0.820    -2.21  
3      0.487      0.738      0.576     1.12  
4     -0.305      1.51       0.390    -0.0449
5     NA         NA         NA        -0.0162
6     NA         NA         NA         0.944

Example data:

set.seed(1)

data_long <- data.frame(
  date = c(rep(20000103:20000105, 4), rep(20000106, 6)),
  rel_spread = rnorm(18)
)

I indeed didn't have the same number of rows for each date so this helped me a lot, thank you! — ameliedc, Mar 19 '22 at 11:21

how can I make a new data frame where the columns are the unique values with corresponding observations from an old data frame?

2 Answers2