1

I am having a data frame as shown below:

dat <- data.frame(QuarterYear = c("Q4 2019", "Q4 2019", "Q4 2019", 
                              "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", "Q4 2019", 
                              "Q4 2019", "Q4 2019", "Q4 2019", "Q1 2020", "Q1 2020", "Q1 2020", 
                              "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", "Q1 2020", 
                              "Q1 2020", "Q1 2020", "Q1 2020", "Q2 2020", "Q2 2020", "Q2 2020", 
                              "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", "Q2 2020", 
                              "Q2 2020", "Q2 2020", "Q2 2020", "Q3 2020", "Q3 2020", "Q3 2020", 
                              "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", "Q3 2020", 
                              "Q3 2020", "Q3 2020", "Q3 2020"), 
              Grade = c("Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11", "Grade 8", "Grade 8", 
                        "Grade 8", "Grade 9", "Grade 9", "Grade 9", "Grade 10", "Grade 10", 
                        "Grade 10", "Grade 11", "Grade 11", "Grade 11"), 
              Type = c("overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", 
                       "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", 
                       "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", 
                       "RT", "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", 
                       "overallAverage", "CT", "RT", "overallAverage", "CT", "RT", "overallAverage", 
                       "CT", "RT", "overallAverage", "CT", "RT"), 
              value = c(2.48, 2.21, 
                        0.27, 3.48, 3.03, 0.45, 4.6, 4, 0.6, 2.8, 2.4, 0.4, 2.54, 2.28, 
                        0.26, 3.45, 3, 0.45, 4.46, 3.88, 0.58, 3.56, 2.81, 0.75, 2.47, 
                        2.14, 0.33, 2.96, 2.54, 0.41, 4.1, 3.69, 0.41, 3.44, 2.61, 0.83, 
                        2, 1.81, 0.19, 2.54, 2.26, 0.28, 4.11, 3.68, 0.43, 2.67, 2.11, 
                        0.56), stringsAsFactors = FALSE)

I am trying to reshape this data frame into a wide-format where unique values of Type will be the rows and the values will be filled based on the QuarterYear and Grade.

To put in the simple words, if the first row is OverallAverage, the first 4 columns will be representing Q4 2019-Grade 8 to Q3 2020- Grade 8. The next 4 columns will be for Q4 2019-Grade 9 to Q3 2020-Grade 9 and so on.

I tried using the reshape function

widerDat <- reshape(dat, direction = "wide",idvar = "Type",timevar = "value")  

How can I combine QuarterYear and Grade to get the required output?

Please help me arrive at a suitable solution. Thanks in advance!!

jay.sf
  • 60,139
  • 8
  • 53
  • 110
Nevedha Ayyanar
  • 845
  • 9
  • 27
  • 2
    Does this work? dat %>% pivot_wider(names_from = c(QuarterYear, Grade), values_from = value) – Karthik S Oct 10 '20 at 08:04
  • 1
    Does this answer your question? [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) – prosoitos Oct 10 '20 at 08:11
  • @prosoitos Apparently not, OP experiences the special case of a split time variable. – jay.sf Oct 10 '20 at 08:28

2 Answers2

1

You could paste the time variables together and use it as a single time= variable like so:

res <- reshape(transform(dat, time=paste(QuarterYear, Grade)), 
               direction="wide", idvar="Type", timevar="time",
               drop=c("QuarterYear", "Grade"))  
res
#             Type value.Q4 2019 Grade 8 value.Q4 2019 Grade 9
# 1 overallAverage                  2.48                  3.48
# 2             CT                  2.21                  3.03
# 3             RT                  0.27                  0.45
#   value.Q4 2019 Grade 10 value.Q4 2019 Grade 11 value.Q1 2020 Grade 8
# 1                    4.6                    2.8                  2.54
# 2                    4.0                    2.4                  2.28
# 3                    0.6                    0.4                  0.26
#   value.Q1 2020 Grade 9 value.Q1 2020 Grade 10 value.Q1 2020 Grade 11
# 1                  3.45                   4.46                   3.56
# 2                  3.00                   3.88                   2.81
# 3                  0.45                   0.58                   0.75
#   value.Q2 2020 Grade 8 value.Q2 2020 Grade 9 value.Q2 2020 Grade 10
# 1                  2.47                  2.96                   4.10
# 2                  2.14                  2.54                   3.69
# 3                  0.33                  0.41                   0.41
#   value.Q2 2020 Grade 11 value.Q3 2020 Grade 8 value.Q3 2020 Grade 9
# 1                   3.44                  2.00                  2.54
# 2                   2.61                  1.81                  2.26
# 3                   0.83                  0.19                  0.28
#   value.Q3 2020 Grade 10 value.Q3 2020 Grade 11
# 1                   4.11                   2.67
# 2                   3.68                   2.11
# 3                   0.43                   0.56

To order the columns in the desired format, we can use substr.

nm <- names(res)[-1]  ## store names in a vector
## generate order vector by relevant characters
o <- order(as.double(substr(nm, 21, 22)), as.double(substr(nm, 10, 13)),
           as.double(substr(nm, 8, 8))) + 1
res <- res[c(1, o)]  ## ordering
names(res)
#  [1] "Type"                   "value.Q4 2019 Grade 8"  "value.Q1 2020 Grade 8" 
#  [4] "value.Q2 2020 Grade 8"  "value.Q3 2020 Grade 8"  "value.Q4 2019 Grade 9" 
#  [7] "value.Q1 2020 Grade 9"  "value.Q2 2020 Grade 9"  "value.Q3 2020 Grade 9" 
# [10] "value.Q4 2019 Grade 10" "value.Q1 2020 Grade 10" "value.Q2 2020 Grade 10"
# [13] "value.Q3 2020 Grade 10" "value.Q4 2019 Grade 11" "value.Q1 2020 Grade 11"
# [16] "value.Q2 2020 Grade 11" "value.Q3 2020 Grade 11"
jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • How can I get the consecutive columns based on the quarter for each grade? For instance, the column order will be Q4 2019 Grade 8, Q1 2020 Grade 8, Q2 2020 Grade 8, Q3 2020 Grade, Q4 2019 Grade 9 going on. – Nevedha Ayyanar Oct 10 '20 at 08:14
  • @NevedhaAyyanar Usng `substr`, see edit. – jay.sf Oct 10 '20 at 08:23
1

I think this will do

library(tidyverse)

wider_data <- dat %>% mutate(new_col = paste(Grade,QuarterYear, sep = " ")) %>%
  select(Type, new_col, value) %>%
  pivot_wider(names_from = new_col, values_from = value)

for re-arranging the columns manually, use this

wider_data <- wider_data %>% select(1,2,6,10,14,3,7,11,15,4,8,12,16,5,9,13,17)
AnilGoyal
  • 25,297
  • 4
  • 27
  • 45