I'm having trouble rearranging the following data frame with the tidyr package:

data <- data.frame(
    name = rep(c("John", "Mary", "Peter", "Sarah"), each=2),
    firm = c("a", "b", "c", "d", "a", "b", "c", "d"),
    rank = rep(1:2, 4),
    value = rnorm(8)
    )

I want to reshape it so that each unique `name` gets a single row, with its `value` entries spread along that row in columns named after `rank`, each followed by the matching `firm` column. Something like this:

  name            1   firm_1            2   firm_2
  John    0.3407997        a   -0.3795377        b
  Mary   -0.8981073        c   -0.5013782        d
  Peter   0.3407997        a   -0.3795377        b
  Sarah  -0.8981073        c   -0.5013782        d
Marie-Eve
  • Try `library(data.table);dcast(setDT(data), name ~ rank, value.var = c("firm", "value"))` – akrun Apr 26 '18 at 14:51
  • You can add `[,c(1,4,2,5,3)]` after akrun's solution to order the columns like yours. – Andre Elrico Apr 26 '18 at 14:58
  • This is very closely related to [this question](https://stackoverflow.com/questions/30592094/r-spreading-multiple-columns-with-tidyr) and [this question](https://stackoverflow.com/questions/43695424/tidyr-spread-multiple-columns). You might try solutions there and then see where you get stuck. – aosmith Apr 26 '18 at 15:43

1 Answer

We can use a combination of dplyr and tidyr, as in the posts linked in the comment by @aosmith.

library(dplyr) # version 1.0.0
library(tidyr) # version 1.1.0

data %>%
    pivot_wider(names_from = rank, values_from = c(firm, value)) %>%
    select(name, `1` = value_1, firm_1, `2` = value_2, firm_2)

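To go fully from long to wide format, `values_from` must take two columns (`firm` and `value`) rather than one: the original data has four columns, not three, so both `firm` and `value` have to be spread across `rank`. `pivot_wider()` names the new columns `firm_1`, `firm_2`, `value_1` and `value_2`, which the `select()` call then reorders and renames.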
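
To make the two steps easier to follow, here is a minimal self-contained sketch that splits the pipeline and keeps the intermediate result. The `set.seed()` call and the object names `wide` and `result` are assumptions added for illustration; they are not part of the original question or answer.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the random values are reproducible

data <- data.frame(
    name  = rep(c("John", "Mary", "Peter", "Sarah"), each = 2),
    firm  = c("a", "b", "c", "d", "a", "b", "c", "d"),
    rank  = rep(1:2, 4),
    value = rnorm(8)
)

# Step 1: spread both firm and value across rank in a single call.
# Each new column is named <source column>_<rank>.
wide <- data %>%
    pivot_wider(names_from = rank, values_from = c(firm, value))

names(wide)
# columns: name, firm_1, firm_2, value_1, value_2

# Step 2: reorder the columns and rename the value columns to the bare rank.
result <- wide %>%
    select(name, `1` = value_1, firm_1, `2` = value_2, firm_2)

result
```

Inspecting `wide` before the `select()` step is a quick way to check which column names `pivot_wider()` generated when adapting this to other data.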

kmacierzanka