I have a column:

Y = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20)

I would like to split it into multiple columns based on the positions of the values. For instance, I would like:

Y1=c(1,2,3,4,5)
Y2=c(6,7,8,9,10)
Y3=c(11,12,13,14,15)
Y4=c(16,17,18,19,20)

Since I am working with a large time series data set, the size of the divisions will be arbitrary, depending on the length of one time period.

DGT

3 Answers

Not a dplyr solution, but I believe the easiest way would involve using matrices.

foo = function(data, sep.in = 5) {
  # matrix() fills column by column, so each column holds sep.in consecutive values
  data.matrix = matrix(data, nrow = sep.in)
  data.df = as.data.frame(data.matrix)
  return(data.df)
}

I have not tested it, but this function should create a data.frame that can be merged with an existing one using cbind().
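Because the divisions depend on the length of one time period, the vector may not always divide evenly. A rough sketch of the same matrix idea that pads the last column with NA in that case; the function name split_into_columns and the argument name period are made up here for illustration:

```r
# Hypothetical helper: one column per period, padded with NA if needed
split_into_columns <- function(x, period) {
  n_cols <- ceiling(length(x) / period)
  length(x) <- n_cols * period          # extends x with NA when it is too short
  m <- matrix(x, nrow = period)         # filled column by column
  colnames(m) <- paste0("Y", seq_len(n_cols))
  as.data.frame(m)
}

Y <- 1:20
wide <- split_into_columns(Y, 5)
names(wide)
#> [1] "Y1" "Y2" "Y3" "Y4"
```

The result can then be attached to another data.frame with the same number of rows via cbind().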

JMenezes

You can use the base split to split this vector into vectors that are each 5 items long. You could also use a variable to store this interval length.

Using rep with each = 5 on a programmatically created sequence gives the numbers 1, 2, ... up to the length divided by 5 (in this case, 4), each repeated 5 times consecutively. split then returns a list of vectors.

It's worth noting that a variety of SO posts will recommend you store similar data in lists such as this, rather than creating multiple variables, so I'm leaving it in list form here.

Y <- 1:20

breaks <- rep(1:(length(Y) / 5), each = 5)
split(Y, breaks)
#> $`1`
#> [1] 1 2 3 4 5
#> 
#> $`2`
#> [1]  6  7  8  9 10
#> 
#> $`3`
#> [1] 11 12 13 14 15
#> 
#> $`4`
#> [1] 16 17 18 19 20

Created on 2019-02-12 by the reprex package (v0.2.1)
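The interval length can also be kept in a variable, as mentioned above. A minimal sketch, assuming the vector length need not be a multiple of the interval: the name period is chosen here for illustration, and ceiling(seq_along(Y) / period) is swapped in for rep so that a shorter final chunk is still handled.

```r
Y <- 1:23
period <- 5  # hypothetical name for the length of one time period

# Positions 1..5 map to group 1, 6..10 to group 2, and so on
groups <- ceiling(seq_along(Y) / period)
chunks <- split(Y, groups)

# Number of values in each chunk; the last one is shorter
unname(lengths(chunks))
#> [1] 5 5 5 5 3
```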

camille
  • Could you please include some code showing how to get the same result in multiple variables? I intend to create a moving window to visualize multiple plots of these divisions. – DGT Feb 12 '19 at 19:58
  • If you need help with a wider scope of the problem, you should update the question to include more data or more situations than what you initially described. – camille Feb 12 '19 at 20:01

We can use split (posting the code from my comment as an answer) to split the vector into a list of vectors.

lst <- split(Y, as.integer(gl(length(Y), 5, length(Y))))
lst
#$`1`
#[1] 1 2 3 4 5

#$`2`
#[1]  6  7  8  9 10

#$`3`
#[1] 11 12 13 14 15

#$`4`
#[1] 16 17 18 19 20

Here, gl creates a grouping index from the n, k, and length parameters, where n is an integer giving the number of levels, k is an integer giving the number of replications, and length is an integer giving the length of the result.

In our case, we want to have 'k' as 5.

as.integer(gl(length(Y), 5, length(Y)))
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
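One detail worth spelling out: with n = length(Y), gl creates 20 levels of which only 4 are used, and split keeps empty levels of a factor by default, which is why the index is converted with as.integer first. A small sketch of an alternative, assuming the length of Y is a multiple of the chunk size (the name period is introduced here for illustration):

```r
Y <- 1:20
period <- 5  # hypothetical name for the chunk length

# Request only as many levels as there are chunks
grp <- gl(n = length(Y) / period, k = period)
nlevels(grp)
#> [1] 4

# This factor can be passed to split() directly
lst2 <- split(Y, grp)
length(lst2)
#> [1] 4

# The 20-level factor, used without as.integer(), keeps its unused levels
length(split(Y, gl(length(Y), period, length(Y))))
#> [1] 20
```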

If we want to have multiple objects in the global environment, use list2env

list2env(setNames(lst, paste0("Y", seq_along(lst))), envir = .GlobalEnv)
Y1
#[1] 1 2 3 4 5
Y2
#[1]  6  7  8  9 10
Y3
#[1] 11 12 13 14 15
Y4
#[1] 16 17 18 19 20
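If the separate objects are only needed so that each chunk can be processed in turn, the list can also be used as it is. A small base R sketch, where mean is only a stand-in for whatever per-chunk step (for example, a plotting call) is actually needed:

```r
Y <- 1:20
lst <- split(Y, as.integer(gl(length(Y), 5, length(Y))))

# Apply the same function to every chunk; mean() is just a placeholder
chunk_means <- sapply(lst, mean)
unname(chunk_means)
#> [1]  3  8 13 18
```

This avoids creating an unknown number of objects in the global environment when the number of chunks changes.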

Or as the OP mentioned dplyr/tidyr in the question, we can use those packages as well

library(tidyverse)
tibble(Y) %>%
   group_by(grp = (row_number()-1) %/% 5 + 1) %>% 
   summarise(Y = list(Y)) %>%
   pull(Y)
#[[1]]
#[1] 1 2 3 4 5

#[[2]]
#[1]  6  7  8  9 10

#[[3]]
#[1] 11 12 13 14 15

#[[4]]
#[1] 16 17 18 19 20
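The grouping expression inside group_by is plain integer arithmetic, so it can be checked in base R without loading any packages. A short sketch, where the vector row_number and the variable period are stand-ins chosen here for dplyr's row_number() and the chunk length:

```r
row_number <- 1:12  # stand-in for the values dplyr::row_number() would return
period <- 5         # hypothetical name for the chunk length

# Shift to 0-based positions, integer-divide by the period, shift back to 1-based
grp <- (row_number - 1) %/% period + 1
grp
#> [1] 1 1 1 1 1 2 2 2 2 2 3 3
```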

data

Y <- 1:20
akrun