Overview
I think what's happening is that the number of parts varies within each ws_category. To remedy this, you'll have to transform your data from one row for each ws_category to one row for each combination of ws_category and its corresponding parts.
To show how, humor me with a baseball reference. Lots of great players never play for teams that win a World Series, while some seem to find themselves earning multiple rings over their careers.
Here, df contains three rows, one each for Ron Santo, Henry Blanco, and John Lester. Neither Santo nor Blanco ever played for a team that won the World Series; Lester, however, was part of two championship teams.
To expand df so that it has one row per baseball player and corresponding World Series championship year, two solutions come to mind:
tidyverse: use tidyr::unnest(); or
base R: use the base and utils packages to stack() the unlisted objects within the World Series column.
Code
# load necessary packages
library( tidyverse )
# make data
df <-
data.frame( Name = c("Ron Santo", "Henry Blanco", "John Lester") )
# add WS Championship Years
df$WS_Champion <-
list( NA, NA, c(2013, 2016) )
# view results
df
# Name WS_Champion
# 1 Ron Santo NA
# 2 Henry Blanco NA
# 3 John Lester 2013, 2016
# base R solution
# name the objects within the list column
# with their corresponding `Name` value
names( df$WS_Champion ) <- df$Name
# unlist each object within the list column
# and stack the vectors into a data frame
df.stacked <-
utils::stack( x = lapply( X = df$WS_Champion, FUN = unlist ) )
# rename the columns
colnames( df.stacked ) <- c("WS_Champion", "Name")
# view results
df.stacked
# WS_Champion Name
# 1 NA Ron Santo
# 2 NA Henry Blanco
# 3 2013 John Lester
# 4 2016 John Lester
# tidyverse solution
# unnest df so that 'Name' repeats for every value in 'WS_Champion'
# (name the list column explicitly; recent tidyr versions warn when it is omitted)
df <-
unnest( data = df, WS_Champion )
# view results
df
# Name WS_Champion
# 1 Ron Santo NA
# 2 Henry Blanco NA
# 3 John Lester 2013
# 4 John Lester 2016
# end of script #
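If you would rather not name the list elements and call stack(), a third route that should give the same long layout is to repeat each Name once per value and flatten the list column directly. This is only a minimal sketch under the same toy data as above; the object names n.values and df.long are my own, and stringsAsFactors = FALSE is set so that Name stays a character vector.

```r
# rebuild the example data: one row per player, years held in a list column
df <- data.frame( Name = c("Ron Santo", "Henry Blanco", "John Lester"),
                  stringsAsFactors = FALSE )
df$WS_Champion <- list( NA, NA, c(2013, 2016) )

# count how many values each list element holds (1, 1, 2 here)
n.values <- lengths( df$WS_Champion )

# repeat each Name once per value and flatten the list column
df.long <-
  data.frame( Name = rep( df$Name, times = n.values ),
              WS_Champion = unlist( df$WS_Champion ),
              stringsAsFactors = FALSE )

# view results: one row per player and championship year
df.long
```

Because rep() and unlist() both walk the list column in row order, the names and years stay aligned without any renaming step afterwards.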
Session Info
R version 3.4.4 (2018-03-15)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] forcats_0.3.0 stringr_1.3.0 dplyr_0.7.4 purrr_0.2.4
[5] readr_1.1.1 tidyr_0.8.0 tibble_1.4.2 ggplot2_2.2.1
[9] tidyverse_1.2.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.16 cellranger_1.1.0 pillar_1.2.1
[4] compiler_3.4.4 plyr_1.8.4 bindr_0.1.1
[7] tools_3.4.4 lubridate_1.7.3 jsonlite_1.5
[10] nlme_3.1-131.1 gtable_0.2.0 lattice_0.20-35
[13] pkgconfig_2.0.1 rlang_0.2.0 psych_1.7.8
[16] cli_1.0.0 rstudioapi_0.7 yaml_2.1.18
[19] parallel_3.4.4 haven_1.1.1 bindrcpp_0.2
[22] xml2_1.2.0 httr_1.3.1 hms_0.4.2
[25] grid_3.4.4 glue_1.2.0 R6_2.2.2
[28] readxl_1.0.0 foreign_0.8-69 modelr_0.1.1
[31] reshape2_1.4.3 magrittr_1.5 scales_0.5.0
[34] rvest_0.3.2 assertthat_0.2.0 mnormt_1.5-5
[37] colorspace_1.3-2 stringi_1.1.7 lazyeval_0.2.1
[40] munsell_0.4.3 broom_0.4.3 crayon_1.3.4