1

I have a wide data frame I'm trying to convert to long. I also want to capture a new column in the long format.

Been looking at a couple posts but getting stuck.

For example, I have this data:

State   NumTotal    Num1    Num2    Num3    Num4
A       352         98      10      174     70
B       331         233     42      30      26

I would like this data:

State   NumClass    Num
A       Num1        98
A       Num2        10
A       Num3        174
A       Num4        70
B       Num1        233
B       Num2        42
B       Num3        30
B       Num4        26

The only actual output I can get is from this code (with df_so being the data frame with the data):

long <- df_so %>%
  gather(State, NumTotal)

long

which does this:

State   NumTotal
Num1    98
Num1    233
Num2    10
Num2    42
Num3    174
Num3    30
Num4    70
Num4    26

So I'm losing the State value (A or B) and know I need to generate a NumClass column but don't see how.

md2614
  • 353
  • 2
  • 14

2 Answers2

2

You can use melt function, from data.table (there's also reshape2::melt but it's no longer maintened so data.table::melt should be favored)

library(data.table)
setDT(df)
df2 <- melt(df, id.vars = "State")
data.table::setnames(df2, old = "variable", new = "NumClass")
linog
  • 5,786
  • 3
  • 14
  • 28
1

The successor to gather is pivot_longer, which does exactly what you want:

library(tidyr)
library(dplyr)

data <- tibble(State = c("A", "B"),
               NumTotal = c(352, 331),
               Num1 = c(98,233),
               Num2 = c(10, 42),
               Num3 = c(174, 30),
               num4 = c(70, 26))

data %>% 
  select(-NumTotal) %>% 
  pivot_longer(starts_with("Num"))

#> # A tibble: 8 x 3
#>   State name  value
#>   <chr> <chr> <dbl>
#> 1 A     Num1     98
#> 2 A     Num2     10
#> 3 A     Num3    174
#> 4 A     num4     70
#> 5 B     Num1    233
#> 6 B     Num2     42
#> 7 B     Num3     30
#> 8 B     num4     26

First I deselect the NumTotal column, since you don't use it, then pivot the table to long format by matching the name prefix of the columns you want pivoted. To get the output column names you want at the same time, do the pivot_longer call this way:

data %>% 
  select(-NumTotal) %>% 
  pivot_longer(starts_with("Num"), names_to = "NumClass", values_to = "NumTotal")

#> # A tibble: 8 x 3
#>   State NumClass NumTotal
#>   <chr> <chr>       <dbl>
#> 1 A     Num1           98
#> 2 A     Num2           10
#> 3 A     Num3          174
#> 4 A     num4           70
#> 5 B     Num1          233
#> 6 B     Num2           42
#> 7 B     Num3           30
#> 8 B     num4           26
Ajar
  • 1,786
  • 2
  • 15
  • 23