I have a large dataset (one million rows and about 300 columns) in wide format. It contains different metrics such as revenue and costs for multiple products. Because of the wide format, a metric like revenue or costs is not a single column; instead, there is one column per product for each metric.
For example, the columns are called "product1_revenue", "product2_revenue", "product1_costs", "product2_costs" and so on.
I want to transform the dataset into long format so I can work with it properly.
I can achieve the transformation for one variable, "total_revenue". This works (except that I cannot keep the id), but I want to do this for all other metrics as well.
dataset %>%
select(ends_with("_total_revenue")) %>%
gather(key=product,value="total_revenue") %>%
mutate(product=str_replace(product,"_total_revenue",""))
### Trying to keep the IDs does not work:
dataset %>%
select(ends_with("_total_revenue"),id) %>%
gather(key=product,value="total_revenue") %>%
mutate(product=str_replace(product,"_total_revenue",""))
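A likely reason the id is lost: `gather()` with no column selection gathers every column it is given, so `id` gets stacked into the key/value pair along with the revenue columns. Below is a minimal sketch that excludes `id` with `-id`. The toy `dataset` and its values are made up for illustration; only the column naming scheme comes from the question.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy stand-in for the real dataset (hypothetical values).
dataset <- data.frame(
  id = 1:2,
  product1_total_revenue = c(10, 20),
  product2_total_revenue = c(30, 40)
)

long_revenue <- dataset %>%
  select(id, ends_with("_total_revenue")) %>%
  # -id keeps id as an identifier column instead of gathering it
  gather(key = "product", value = "total_revenue", -id) %>%
  mutate(product = str_replace(product, "_total_revenue", ""))
```

With `-id`, each output row carries the id of the input row it came from.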
### I want something like this (if it worked, of course)
i<-c("_total_revenue","_total_cost")
for(ends_with(colnames(dataset),i) in i)
{
dataset %>%
select(ends_with(!!i),id) %>%
gather(key=product,value=!!i) %>%
mutate(product=str_replace(product,!!i,""))
print(i)
}
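For comparison, one possible way to get all metrics at once without a loop is `tidyr::pivot_longer()` with the `.value` sentinel in `names_to`. This is a sketch, not a tested solution for the real data: it assumes tidyr 1.0.0 or later, that every column other than `id` follows the `<product>_total_<metric>` pattern, and it uses a toy `dataset` with made-up values. The regular expression in `names_pattern` is likewise an assumption about the column names.

```r
library(tidyr)

# Toy stand-in for the real dataset (hypothetical values).
dataset <- data.frame(
  id = 1:2,
  product1_total_revenue = c(10, 20),
  product2_total_revenue = c(30, 40),
  product1_total_cost = c(1, 2),
  product2_total_cost = c(3, 4)
)

long <- pivot_longer(
  dataset,
  cols = -id,                         # reshape everything except id
  names_to = c("product", ".value"),  # 1st group -> product, 2nd group -> new column names
  names_pattern = "^(.+?)_(total_.+)$"
)
```

The result has one row per `id` and product, with one column per metric (`total_revenue`, `total_cost`), so no per-metric loop is needed.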