
I have the following data.frame, which I want to convert into two separate time-series data frames, one for revenue and one for cost.

df1 = data.frame(year = c('2018','2019', '2020','2019','2020','2021'), 
             company=c('x','x','x','y','y','z'),
             revenue=c(45,78,13,89,48,70),
             cost=c(100,120,130,140,160,164),
             stringsAsFactors=FALSE)
df1
  year company revenue cost
1 2018       x      45  100
2 2019       x      78  120
3 2020       x      13  130
4 2019       y      89  140
5 2020       y      48  160
6 2021       z      70  164

I want to create a new data frame for the revenue data, arranged as follows, with n.a. for every year in which the data is not available. What code can I use to do this?

          2018    2019    2020   2021
1    x       45     78      13   n.a.
2    y     n.a.     89      48   n.a.
3    z     n.a.   n.a.    n.a.    70

2 Answers

0

With the tidyverse...

library(tidyverse)
df1 %>% select(company, year, revenue) %>% pivot_wider(names_from = year, values_from = revenue)
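Since the question asks for two separate frames, here is a minimal sketch that applies the same reshape to each measure. The helper `to_wide()` is a hypothetical name made up for this example, and the sketch assumes dplyr and tidyr (version 1.0.0 or later) are installed. Years with no data come out as `NA`, which is R's missing-value marker rather than the literal text "n.a.".

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(year = c('2018', '2019', '2020', '2019', '2020', '2021'),
                  company = c('x', 'x', 'x', 'y', 'y', 'z'),
                  revenue = c(45, 78, 13, 89, 48, 70),
                  cost = c(100, 120, 130, 140, 160, 164),
                  stringsAsFactors = FALSE)

# to_wide() is a hypothetical helper, not part of any package.
# It keeps only the id columns plus one measure, so that the other
# measure does not prevent rows from collapsing to one per company.
to_wide <- function(df, value_col) {
  df %>%
    select(company, year, all_of(value_col)) %>%
    pivot_wider(names_from = year, values_from = all_of(value_col))
}

revenue_wide <- to_wide(df1, "revenue")
cost_wide    <- to_wide(df1, "cost")
```

Each result should have one row per company and one column per year. Dropping the unused measure before pivoting matters: if `cost` stays in the frame, `pivot_wider()` treats it as an identifying column and keeps one row per original observation.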

0

If you are trying to get both revenue and cost, as your question implies:

library(tidyr)
df2 <- pivot_wider(df1, names_from = year, values_from = c(revenue,cost))

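This gets what you need, I think. Columns 2-5 are the revenues (revenue_2018 to revenue_2021) and columns 6-9 are the costs (cost_2018 to cost_2021). Years with no data are filled with NA.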
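If two separate frames are wanted rather than one wide one, the combined result can be split by column prefix. This is only a sketch: the names `revenue_df` and `cost_df` are made up for illustration, and it assumes dplyr 1.0.0 or later for `rename_with()`.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(year = c('2018', '2019', '2020', '2019', '2020', '2021'),
                  company = c('x', 'x', 'x', 'y', 'y', 'z'),
                  revenue = c(45, 78, 13, 89, 48, 70),
                  cost = c(100, 120, 130, 140, 160, 164),
                  stringsAsFactors = FALSE)

df2 <- pivot_wider(df1, names_from = year, values_from = c(revenue, cost))

# Keep the company column plus one measure's columns, then strip the
# "revenue_" / "cost_" prefix so the column names are just the years.
revenue_df <- df2 %>%
  select(company, starts_with("revenue_")) %>%
  rename_with(~ sub("^revenue_", "", .x), .cols = -company)

cost_df <- df2 %>%
  select(company, starts_with("cost_")) %>%
  rename_with(~ sub("^cost_", "", .x), .cols = -company)
```

Both frames then share the same layout (company, then one column per year), which makes them easy to compare side by side.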

John Garland