This can be accomplished with pivot_wider()
function from version 1.0.0 or newer of the tidyr
package.
First, we rename the products
column to Products
. Then we use pivot_wider()
arguments to assign column names from the Cat
column of the input data frame, values from the Products
and Sales
columns, use names_glue
to set {.value}_{Cat}
as the column names, and fill missing values with 0.
textData <- "Date Cat Sales products
03-01-2016 A 39 3
04-01-2016 A 28 6
05-01-2016 B 50 4
06-01-2016 D 40 2
07-01-2016 D 39 1
09-01-2016 D 41 5"
data <- read.table(text = textData,header=TRUE)
library(tidyr)
data %>% rename(Products = products) %>%
pivot_wider(.,Date,names_from=Cat,
values_from=c(Sales,Products),
names_glue="{.value}_{Cat}",
values_fill=0)
...and the output:
# A tibble: 6 x 7
Date Sales_A Sales_B Sales_D Products_A Products_B Products_D
<chr> <int> <int> <int> <int> <int> <int>
1 03-01-2016 39 0 0 3 0 0
2 04-01-2016 28 0 0 6 0 0
3 05-01-2016 0 50 0 0 4 0
4 06-01-2016 0 0 40 0 0 2
5 07-01-2016 0 0 39 0 0 1
6 09-01-2016 0 0 41 0 0 5
>
Explaining names_glue
The tidyverse uses the glue package to interpret string literals, evaluate and insert them into argument strings of R functions. tidyr
combines glue with a pivot specification to determine exactly how to name columns when pivoting a data frame from narrow format to wide, or vise versa.
Per the tidyr
documentation, names_glue="{.value}_{Cat}"
is interpreted to generate a specification data frame that controls how columns in the output data frame are named.
The pivot spec is a data frame with one row for each column in the wide format version of the data that is not present in the long format, and two special columns that start with .
: .name
which gives the name of the column, and .value
provides the name of the column that the values in the cells will go into.
For the pivot_wider()
call in this question, the pivot spec data frame looks like this.
# A tibble: 6 x 3
.name .value Cat
<chr> <chr> <chr>
1 Sales_A Sales A
2 Sales_B Sales B
3 Sales_D Sales D
4 Products_A Products A
5 Products_B Products B
6 Products_D Products D
>
NOTE: in the output data frame I did not include blank spaces in the column names as illustrated in the question because they are a hassle to use in R. Also note that the question has a Sales_C
output column when the raw data does not have a C category. Instead, it has a D category.
Also NOTE: tidyr 1.0.0 was introduced in October 2019. tidyr::pivot_wider()
replaces the tidyr::spread()
function, which has been deprecated as of release 1.0.0.