I have a dataset like this:

##current state of data
team= c("a","a","a","a","b","b","b","b")
situation=c("fives","fives","short","short","fives","fives","short","short")
year= c("2014","2015","2014","2015","2014","2015","2014","2015")
shots= runif(8, min = 2, max=10)
saves= runif(8, min = 1, max=6)
df = data.frame(team,year,situation, shots, saves)

I want the dataset to look like this:

team = c("a","b")
fives2014shots = runif(2, min = 2, max=10)
fives2015shots = runif(2, min = 2, max=10)
short2014shots = runif(2, min = 2, max=10)
short2015shots = runif(2, min = 2, max=10)
fives2014saves = runif(2, min = 1, max=6)
fives2015saves = runif(2, min = 1, max=6)
short2014saves = runif(2, min = 1, max=6)
short2015saves = runif(2, min = 1, max=6)
data.frame(team, fives2014shots,fives2015shots, 
fives2014saves,fives2015saves, short2014shots, short2015shots, 
short2014saves, short2015saves)

This code gives the closest result, but it only keeps the 'saves' variable. I need both 'saves' and 'shots' to appear as part of the new column names:

library(reshape)
cast(df, team ~ year + situation)

Thank you!

1 Answer

You can do this easily with tidyr.

library(tidyr)

df %>% 
  gather("var", "val", -team, -year, -situation) %>% 
  unite(key, situation, year, var) %>% 
  spread(key, val)

#>   team fives_2014_saves fives_2014_shots fives_2015_saves fives_2015_shots
#> 1    a         3.957574         5.148969         1.909761         6.492150
#> 2    b         4.301518         6.643775         2.923988         3.879531
#>   short_2014_saves short_2014_shots short_2015_saves short_2015_shots
#> 1         4.696710         9.267883         4.178224         9.285276
#> 2         5.886455         9.417314         4.179030         6.485290
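
For reference, newer versions of tidyr (1.0.0 and later) provide pivot_wider(), which can likely do the same reshape in one step. The following is only a sketch under that assumption: it rebuilds the example data with a fixed seed so it runs on its own, and it uses names_glue to produce column names without separators (e.g. fives2014shots), as in the desired output from the question. The name `wide` is just an illustrative choice.

```r
library(tidyr)

# Rebuild the example data from the question (seed added for reproducibility)
set.seed(1)
df <- data.frame(
  team      = c("a", "a", "a", "a", "b", "b", "b", "b"),
  year      = c("2014", "2015", "2014", "2015", "2014", "2015", "2014", "2015"),
  situation = c("fives", "fives", "short", "short", "fives", "fives", "short", "short"),
  shots     = runif(8, min = 2, max = 10),
  saves     = runif(8, min = 1, max = 6)
)

# One row per team; one column per situation/year/measure combination.
# {.value} stands for the name of the value column (shots or saves).
wide <- pivot_wider(
  df,
  id_cols     = team,
  names_from  = c(situation, year),
  values_from = c(shots, saves),
  names_glue  = "{situation}{year}{.value}"
)

print(wide)
```

Dropping the names_glue argument should fall back to pivot_wider()'s default naming, which puts the value name first and joins the parts with underscores.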
austensen