Transforming a dataframe in R

Question

How to I convert df to df2? df2 is an expanded version of df1 with a new column called NUMPAGES added.

df <-data.frame(NAME=c("author1", "author2"), FIRSTBOOK = c(100, 120), SECONDBOOK = c(200, 150), THIRDBOOK = c(500, 100))

df
#     NAME FIRSTBOOK SECONDBOOK THIRDBOOK
#1 author1       100        200       500
#2 author2       120        150       100

df2 <- data.frame(NAME = c("author1", "author1", "author1", "author2", "author2", "author2"), BOOK = c("FIRSTBOOK", "SECONDBOOK", "THIRDBOOK", "FIRSTBOOK", "SECONDBOOK", "THIRDBOOK"), NUMPAGES = c(100, 200, 500, 120, 150, 100))

df2
#     NAME       BOOK NUMPAGES
#1 author1  FIRSTBOOK      100
#2 author1 SECONDBOOK      200
#3 author1  THIRDBOOK      500
#4 author2  FIRSTBOOK      120
#5 author2 SECONDBOOK      150
#6 author2  THIRDBOOK      100

score 2 · Answer 1 · answered Nov 23 '20 at 23:18

It can also be done with base R (no need of packages):

#Base R
Reshaped <- reshape(df,idvar = 'NAME',
        varying = list(names(df)[-1]),
        direction = 'long',times = names(df)[-1],
        timevar = 'Book',v.names = 'NUMPAGES')
rownames(Reshaped) <- NULL

Output:

Reshaped
     NAME       Book NUMPAGES
1 author1  FIRSTBOOK      100
2 author2  FIRSTBOOK      120
3 author1 SECONDBOOK      200
4 author2 SECONDBOOK      150
5 author1  THIRDBOOK      500
6 author2  THIRDBOOK      100

Or using reshape2:

library(reshape2)
#Code 2
Reshaped <- reshape2::melt(df,id.var='NAME',value.name = 'NUMPAGES')

Output:

Reshaped
     NAME   variable NUMPAGES
1 author1  FIRSTBOOK      100
2 author2  FIRSTBOOK      120
3 author1 SECONDBOOK      200
4 author2 SECONDBOOK      150
5 author1  THIRDBOOK      500
6 author2  THIRDBOOK      100

akrun · Answer 2 · 2020-11-23T23:32:56.780

We can use pivot_longer which is more general and can be extended to multiple scenarios

library(dplyr)
library(tidyr)
df %>% 
  pivot_longer(cols = -NAME, names_to = 'BOOK', values_to = 'NUMPAGES')

-output

# A tibble: 6 x 3
#  NAME    BOOK       NUMPAGES
#  <chr>   <chr>         <dbl>
#1 author1 FIRSTBOOK       100
#2 author1 SECONDBOOK      200
#3 author1 THIRDBOOK       500
#4 author2 FIRSTBOOK       120
#5 author2 SECONDBOOK      150
#6 author2 THIRDBOOK       100

Or using base R (no need of packages)

data.frame(NAME = rep(df$NAME, each = ncol(df)-1), 
   BOOK = rep(names(df)[-1], nrow(df)), NUMPAGES = unlist(df[-1]))

Transforming a dataframe in R

2 Answers2