R pivot_longer(): tidyr wide to long manipulation reverse pivot summary to individual values

Question

I am trying to manipulate a wide table which represents the per cent of each household composition type within two towns to a long-form table (basically, a reverse of a pivot table).

In the long table, I would like each row to represent the household composition value for one household, so the number of rows for each combination depends on the values provided, e.g. 18 rows of (town.a, Singles), 8 rows of (town.b, Singles), etc. However, I just can't figure out how to do this expansion based on the values in each town column.

I have a data.frame() that looks like this:

household.data <- data.frame(
  household.composition = c("Singles", "Couples", "Families", "Single Parents", "Sharers"),
  town.a = c(18, 29, 41, 3, 3),
  town.b = c(8, 37, 48, 9, 3)
)

The values under the Town A and Town B columns represent the per cent makeup of each household composition within each town.

The goal is to get from this wide summary format to a long format that repeats each value in the Household Composition column as many times as the number in the Town A and Town B columns, so that each row represents the household composition of one household. For example, Singles would appear in 18 rows for town.a and 8 rows for town.b.

I know that there must be a way to do this using the spread/gather or pivot functions in tidyr. However, I just can't figure out how to do the expansion so that the number of rows corresponds to the per cent value.

Do you want there to be, for example, 18 rows of (town.a, singles), 8 rows of (town.b, singles), etc.? So the number of rows for each combination depends on the values provided? – HNSKD, Apr 29 '20 at 03:59

score 4 · Accepted Answer · answered Apr 29 '20 at 04:26

You can get the data in long format and use uncount to replicate rows.

library(tidyr)
pivot_longer(household.data, cols = -household.composition) %>% uncount(value)

# A tibble: 199 x 2
#   household.composition name  
#   <chr>                 <chr> 
# 1 Singles               town.a
# 2 Singles               town.a
# 3 Singles               town.a
# 4 Singles               town.a
# 5 Singles               town.a
# 6 Singles               town.a
# 7 Singles               town.a
# 8 Singles               town.a
# 9 Singles               town.a
#10 Singles               town.a
# … with 189 more rows
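
To make the two steps explicit, here is a minimal sketch that keeps the intermediate long table. The column name town and the object names long and expanded are illustrative choices, not part of the original answer.

```r
library(tidyr)

household.data <- data.frame(
  household.composition = c("Singles", "Couples", "Families", "Single Parents", "Sharers"),
  town.a = c(18, 29, 41, 3, 3),
  town.b = c(8, 37, 48, 9, 3)
)

# Step 1: one row per (composition, town) pair, with the percentage in `value`
long <- pivot_longer(household.data, cols = -household.composition,
                     names_to = "town", values_to = "value")

# Step 2: repeat each row `value` times; uncount() drops `value` by default
expanded <- uncount(long, value)

nrow(long)      # 10
nrow(expanded)  # 199
```

Because the percentages sum to 94 for town.a and 105 for town.b rather than 100, the result has 94 + 105 = 199 rows, which matches the tibble header above.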

Oh what a great function, I haven't come across uncount() thanks — CarlaBirdy, Apr 29 '20 at 05:46

score 2 · Answer 2 · edited Apr 29 '20 at 06:49

You can work as follows:

1. Convert the data from wide to long format using tidyr::pivot_longer.
2. Use lapply to apply rep (replicate) to every column, repeating each element the number of times given in value.
3. Since lapply returns a list, use dplyr::bind_rows to bind the result back into a data frame.
4. Remove the value column to get the desired output.
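
The replication step relies on rep() repeating each element of a vector when its second argument is a vector of the same length. A small base R sketch of that behaviour, using a made-up two-row table (toy and cols are illustrative names):

```r
# Toy data: two composition types with small counts
toy <- data.frame(kind = c("Singles", "Couples"), value = c(2, 3))

# rep(x, times) with a vector `times` repeats each element of x accordingly
rep(toy$kind, toy$value)
# "Singles" "Singles" "Couples" "Couples" "Couples"

# lapply() does the same for every column and returns a list of columns
cols <- lapply(toy, rep, toy$value)

# The list has equal-length elements, so it converts back to a data frame
expanded_toy <- as.data.frame(cols)
nrow(expanded_toy)  # 5
```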

library(dplyr)
library(tidyr)
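household.data %>%
  # wide to long: one row per (composition, town) pair
  pivot_longer(-household.composition, names_to = "town") %>%
  # repeat every element of each column `value` times (returns a list)
  lapply(rep, .$value) %>%
  # bind the list of columns back into a data frame
  bind_rows() %>%
  # drop the helper column
  select(-value)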

answered Apr 29 '20 at 04:13 by HNSKD · edited Apr 29 '20 at 06:49 by UseR10085

Works perfectly, thank you @HNSKD :) – CarlaBirdy Apr 29 '20 at 04:16
You can save a line with `purrr::map_df` in place of `lapply` if you want. – Ian Campbell Apr 29 '20 at 04:20

score 1 · Answer 3 · answered Apr 29 '20 at 04:33

Base R solution:

setNames(within(
  reshape(
    household.data,
    direction = "long",
    varying = grepl("town", names(household.data)),
    timevar = "town_type",
    times = NULL,
    idvar = !(grepl("town", names(household.data))),
    new.row.names = 1:(nrow(household.data) * length(grepl(
      "town", names(household.data)
    )))
  ),
  {
    rm(town)
  }
), c("household.composition", "town"))
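
For comparison, the one-row-per-household expansion can also be sketched in base R without reshape(), by building the long table by hand and then indexing its rows with rep(). This is an alternative sketch, not the answer author's code; town_cols, long and expanded are illustrative names.

```r
household.data <- data.frame(
  household.composition = c("Singles", "Couples", "Families", "Single Parents", "Sharers"),
  town.a = c(18, 29, 41, 3, 3),
  town.b = c(8, 37, 48, 9, 3)
)

town_cols <- c("town.a", "town.b")

# Long table: stack the town columns on top of each other
long <- data.frame(
  household.composition = rep(household.data$household.composition,
                              times = length(town_cols)),
  town  = rep(town_cols, each = nrow(household.data)),
  value = unlist(household.data[town_cols], use.names = FALSE)
)

# Repeat each row index `value` times, then keep only the two label columns
expanded <- long[rep(seq_len(nrow(long)), times = long$value),
                 c("household.composition", "town")]
rownames(expanded) <- NULL

nrow(expanded)  # 199
```

Rows come out grouped by town here (all of town.a, then all of town.b), whereas pivot_longer() interleaves the towns within each composition type.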

Frank Zhang · Answer 4 · 2020-04-29T06:30:40.290

data.table solution

library(data.table)
# Melt to long format, then repeat each row index `value` times
melt(setDT(household.data), id.vars = "household.composition")[
  rep(1:.N, value), .(household.composition, variable)]
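
Note that setDT() converts household.data to a data.table in place. If the original data.frame should stay untouched, a variant along these lines may be preferable. It is a sketch under the assumption that a copy is acceptable; dt, long, idx, expanded and the column names town and pct are illustrative.

```r
library(data.table)

household.data <- data.frame(
  household.composition = c("Singles", "Couples", "Families", "Single Parents", "Sharers"),
  town.a = c(18, 29, 41, 3, 3),
  town.b = c(8, 37, 48, 9, 3)
)

# as.data.table() makes a copy, so household.data stays a plain data.frame
dt <- as.data.table(household.data)

# Long format with explicit names for the melted columns
long <- melt(dt, id.vars = "household.composition",
             variable.name = "town", value.name = "pct")

# Row indices, each repeated `pct` times
idx <- rep(seq_len(nrow(long)), times = long$pct)
expanded <- long[idx, .(household.composition, town)]

nrow(expanded)  # 199
```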

answered Apr 29 '20 at 06:03 · edited Apr 29 '20 at 06:30
