2

I have data frame df_wide for company data in a wide format

    df_wide <- data.frame(Company=c('CompanyA','CompanyB', 'CompanyC'),
             Industry=c('Manufacturing', 'Telecom', 'Services'),
             Sales.2015=c('100', '500', '1000'), 
             Sales.2016=c('110', '550', '1100'), 
             Sales.2017=c('120', '600', '1200'),
             EBITDA.2015=c('10', '50', '100'), 
             EBITDA.2016=c('11', '55', '110'),
             EBITDA.2017=c('12', '60', '120'))

        Company      Industry Sales.2015 Sales.2016 Sales.2017 EBITDA.2015 EBITDA.2016 EBITDA.2017
    1 CompanyA Manufacturing        100        110        120          10          11          12
    2 CompanyB       Telecom        500        550        600          50          55          60
    3 CompanyC      Services       1000       1100       1200         100         110         120

And I wish to transform the data into a long format like df_long

    df_long <- data.frame(Company=c('CompanyA', 'CompanyA', 'CompanyA', 'CompanyB', 'CompanyB','CompanyB','CompanyC','CompanyC', 'CompanyC'),
              Industry=c('Manufacturing','Manufacturing','Manufacturing','Telecom','Telecom','Telecom','Services','Services','Services'),
              Year=c('2015','2016','2017','2015','2016','2017','2015','2016','2017'),
              Sales=c('100','110','120','500', '550','600','1000','1100','1200'),
              EBITDA=c('10','11','12','50','55','60','100','110','120'))

       Company      Industry Year Sales EBITDA
    1 CompanyA Manufacturing 2015   100     10
    2 CompanyA Manufacturing 2016   110     11
    3 CompanyA Manufacturing 2017   120     12
    4 CompanyB       Telecom 2015   500     50
    5 CompanyB       Telecom 2016   550     55
    6 CompanyB       Telecom 2017   600     60
    7 CompanyC      Services 2015  1000    100
    8 CompanyC      Services 2016  1100    110
    9 CompanyC      Services 2017  1200    120

I have tried with pivot_longer and it works fine with just one variable but struggling when trying to pivot both Sales and EBITDA.

    df_long2 <- df_wide %>% pivot_longer(cols = starts_with("Sales"),
                                 names_to = "Year",
                                 values_to = "Sales")
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
Nic
  • 79
  • 6

4 Answers4

5

Using pivot_longer

tidyr::pivot_longer(df_wide, 
                   cols = -c(Company, Industry), 
                   names_to = c(".value", "Year"),
                   names_sep = "\\.") %>% type.convert()

#  Company  Industry       Year Sales EBITDA
#  <fct>    <fct>         <int> <int>  <int>
#1 CompanyA Manufacturing  2015   100     10
#2 CompanyA Manufacturing  2016   110     11
#3 CompanyA Manufacturing  2017   120     12
#4 CompanyB Telecom        2015   500     50
#5 CompanyB Telecom        2016   550     55
#6 CompanyB Telecom        2017   600     60
#7 CompanyC Services       2015  1000    100
#8 CompanyC Services       2016  1100    110
#9 CompanyC Services       2017  1200    120
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

Base R solution:

df_long <- 

  reshape(df_wide,

        direction = "long",

        varying = which(!names(df_wide) %in% c("Company", "Industry")),

        ids = NULL,

        new.row.names = 1:(length(which(!names(df_wide) %in% c("Company", "Industry"))) * nrow(df_wide))

        )
hello_friend
  • 5,682
  • 1
  • 11
  • 15
0

I'm not familiar yet with pivot_longer() but here is a data.table solution:

library(data.table)    
setDT(df_wide)    
melt(
  df_wide, 
  id.vars = c("Company", "Industry"), 
  measure.vars = patterns(Sales = 'Sales', EBITDA = 'EBITDA'),
  variable.name = "Year"
)[, Year := (2015:2017)[Year]]

    Company      Industry Year Sales EBITDA
1: CompanyA Manufacturing 2015   100     10
2: CompanyB       Telecom 2015   500     50
3: CompanyC      Services 2015  1000    100
4: CompanyA Manufacturing 2016   110     11
5: CompanyB       Telecom 2016   550     55
6: CompanyC      Services 2016  1100    110
7: CompanyA Manufacturing 2017   120     12
8: CompanyB       Telecom 2017   600     60
9: CompanyC      Services 2017  1200    120
s_baldur
  • 29,441
  • 4
  • 36
  • 69
0

Here is a solution with base R (similar to the solution by @hello_friend), where reshape() is used to make table from wide to long:

df_long <- reshape(df_wide,
        direction = "long",
        varying = seq(df_wide)[-(1:2)],
        ids = NULL,
        timevar = "Year",
        times = unique(gsub("\\w+\\.(.*)","\\1",names(df_wide[-(1:2)]))),
        new.row.names = seq(ncol(df_wide[-(1:2)])*nrow(df_wide))
        )

such that

> df_long
   Company      Industry Year Sales EBITDA
1 CompanyA Manufacturing 2015   100     10
2 CompanyB       Telecom 2015   500     50
3 CompanyC      Services 2015  1000    100
4 CompanyA Manufacturing 2016   110     11
5 CompanyB       Telecom 2016   550     55
6 CompanyC      Services 2016  1100    110
7 CompanyA Manufacturing 2017   120     12
8 CompanyB       Telecom 2017   600     60
9 CompanyC      Services 2017  1200    120
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81