1

Is there a way in R to place every three values in the column "V" (below) to new columns? In others words, I need to reshape the data from long to wide, but only to three columns and where the values are what appears in column V. Below is a demonstration.

Thank you in advance!

enter image description here

enter image description here

data = structure(list(Key = c(200, 200, 200, 200, 200, 200, 300, 300, 
                300, 300, 300, 300, 400, 400, 400, 400, 400, 400), 
                V = c("a", "b", "c", "b", "d", "c", "d", "b", "c", "a", "f", "c", "d", "b", 
                "c", "a", "b", "c")), 
                row.names = c(NA, 18L), 
                class = "data.frame")
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68

1 Answers1

1

Here is one option

data %>%
    group_by(Key) %>%
    mutate(
        grp = gl(n() / 3, 3),
        col = c("x", "y", "z")[(row_number() + 2) %% 3 + 1]) %>%
    group_by(Key, grp) %>%
    spread(col, V) %>%
    ungroup() %>%
    select(-grp)
## A tibble: 6 x 4
#    Key x     y     z
#  <dbl> <chr> <chr> <chr>
#1   200 a     b     c
#2   200 b     d     c
#3   300 d     b     c
#4   300 a     f     c
#5   400 d     b     c
#6   400 a     b     c

Note: This assumes that the number of entries per Key is divisible by 3.

Instead of grp = gl(n() / 3, 3) you can also use grp = rep(1:(n() / 3), each = 3).


Update

In response to your comments, let's create sample data by removing some rows from data such that for Key = 200 and Key = 300 we don't have a multiple of 3 V entries.

data2 <- data %>% slice(-c(1, 8))

Then we can do

data2 %>%
    group_by(Key) %>%
    mutate(grp = gl(ceiling(n() / 3), 3)[1:n()]) %>%
    group_by(Key, grp) %>%
    mutate(col = c("x", "y", "z")[1:n()]) %>%
    spread(col, V) %>%
    ungroup() %>%
    select(-grp)
## A tibble: 6 x 4
#    Key x     y     z
#  <dbl> <chr> <chr> <chr>
#1   200 b     c     b
#2   200 d     c     NA
#3   300 d     c     a
#4   300 f     c     NA
#5   400 d     b     c
#6   400 a     b     c

Note how "missing" values are filled with NA.

Maurits Evers
  • 49,617
  • 4
  • 47
  • 68
  • @Mauritus Evers, do you have any suggestion if there are repeated rows? In other words, if keys are shared for rows. Error: Each row of output must be identified by a unique combination of keys. Keys are shared for 24 rows: – Eliel Epelbaum May 08 '19 at 08:56
  • @ElielEpelbaum Could you add a minimal example that reproduces the error and also add the expected output? – Maurits Evers May 08 '19 at 09:00
  • @Mauritus Evers, I can't unfortunately. I just realized, answering your previous question, that it's not always a multiple of 3 entries (I thought it was). Could this be the reason for the error? – Eliel Epelbaum May 08 '19 at 09:19
  • @ElielEpelbaum Yes, that would most definitely cause issues; my answer will only work if the number of entries per `Key` is divisible by 3. It shouldn't be difficult to have a different approach for when you don't always have multiple of 3 entries. If you provide some minimal sample data (mock-data works just fine) and expected output I will edit my post to include an alternative solution. – Maurits Evers May 08 '19 at 09:25
  • @ElielEpelbaum I've added an example to address the not-multiple-of-3 situation, based on your sample `data` but with some rows removed. Please take a look. – Maurits Evers May 08 '19 at 10:04
  • @Mauritus Evers, thank you again for your help. The code works perfectly fine and I understand what you did. Thanks! – Eliel Epelbaum May 09 '19 at 15:38