0

I have a time series data frame with two ID columns and about 1000 columns for day 1, day 2, etc. I want to convert from my data frame being in the form

a    b    t1    t2  ... t1000
_____________________________
a1   b1   #     #       #
a2   b2   #     #       #

to being in the form

a    b    t    value
____________________
a1   b1   't1'     #  
a1   b1   't2'     #
a2   b2   't1'     #   
a2   b2   't2'     #

Essentially, I want to do something like this:

dataframe %>%
  select(starts_with("t_") ) %>%
  gather(key = "t", value = "value")

so that I have a dataframe looking like this:

t    value
__________
't1'   #
't2'   #
...
't100' #

for each row in the original dataframe. Then once I have the time columns that I generated from one row in the original dataframe, I want to left append the "a" and "b" columns to each row, so that this:

t    value
__________
't1'   #
't2'   #
...
't100' #

turns into this:

a    b    t    value
____________________
a1   b1   't1'   #
a1   b1   't2'   #
...
a1   b1   't100' #

and then I repeat this process for each row, stacking the new generated dataframe below (or above) the previously generated dataframe. At the end I want to have the dataframe in the code block above generated for each row of the original dataframe, and all stacked on top of one another. I could do this with a for loop, or maybe even some sapply magic, but I don't want to use a for loop in R, and I feel like there's a better way to do this than sapply-ing some function into each row of the original dataframe.

Can anyone help me? Thanks. Preferably using tidyverse.

Saswat Mishra
  • 185
  • 1
  • 11

2 Answers2

1

We can use pivot_longer from tidyr.

library(tidyr)
data <- matrix(1:1000,nrow=2,ncol=1000)
colnames(data) <- paste0("t",1:1000)
data <- data.frame(a=c("a1","a2"),b=c("b1","b2"),data)
data[1:2,1:10]
#   a  b t1 t2 t3 t4 t5 t6 t7 t8
#1 a1 b1  1  3  5  7  9 11 13 15
#2 a2 b2  2  4  6  8 10 12 14 16


data %>% 
   pivot_longer(cols = starts_with("t"), names_to = "t")

## A tibble: 2,000 x 4
#   a     b     t     value
#   <fct> <fct> <chr> <int>
# 1 a1    b1    t1        1
# 2 a1    b1    t2        3
# 3 a1    b1    t3        5
# 4 a1    b1    t4        7
# 5 a1    b1    t5        9
# 6 a1    b1    t6       11
# 7 a1    b1    t7       13
# 8 a1    b1    t8       15
# 9 a1    b1    t9       17
#10 a1    b1    t10      19
Ian Campbell
  • 23,484
  • 14
  • 36
  • 57
0

Try this:

library(dplyr)    
dataframe %>%
     gather(key = "t", value = "value", 3:ncol(.))
denisafonin
  • 1,116
  • 1
  • 7
  • 16