0

I would like to create columns based on the State, to count the total count for the year.

test <- data.frame(Year=c(rep(2005,3),rep(2006,3),rep(2007,3)), State = rep(c("AK","AZ","CA"),3), count=c(12312,43243,1234,6737,56,123,1,6,8))
test
  Year State count
1 2005    AK 12312
2 2005    AZ 43243
3 2005    CA  1234
4 2006    AK  6737
5 2006    AZ    56
6 2006    CA   123
7 2007    AK     1
8 2007    AZ     6
9 2007    CA     8 

Below is my desired output:

answer <- data.frame(State=c("AK","AZ","CA"), count2005=c(12312,43243,1234), count2006=c(6737,56,123), count2007=c(1,6,8))
answer
  State count2005 count2006 count2007
1    AK     12312      6737         1
2    AZ     43243        56         6
3    CA      1234       123         8
Skaqqs
  • 4,010
  • 1
  • 7
  • 21
nubprog
  • 31
  • 3
  • 3
    Looks like pivot long to wide to me (https://tidyr.tidyverse.org/reference/pivot_wider.html) – Skaqqs Jan 20 '22 at 15:28
  • 2
    `test %>% pivot_wider(State, names_from = Year, values_from = count) ` – Maël Jan 20 '22 at 15:33
  • 1
    You can add to the @Maël's answer `names_prefix = "count"` like this: `test %>% pivot_wider(State, names_from = Year, names_prefix = "count", values_from = count)` And you will get the desired column names. Cheers. – lovalery Jan 20 '22 at 15:39
  • It's confusing that you refer to summarizing and counting data in both the title and text, since the operation you're actually doing is just reshaping, not actually counting anything new – camille Jan 20 '22 at 15:53

1 Answers1

1

1) pivot_wider We can use pivot_wider like this

library(tidyr)

test %>% 
  pivot_wider(names_from = Year, values_from = count, names_prefix = "count")
## # A tibble: 3 x 4
##   State count2005 count2006 count2007
##   <chr>     <dbl>     <dbl>     <dbl>
## 1 AK        12312      6737         1
## 2 AZ        43243        56         6
## 3 CA         1234       123         8

2) reshape A base R solution using reshape is

reshape(test, dir = "wide", idvar = "State", timevar = "Year")
##   State count.2005 count.2006 count.2007
## 1    AK      12312       6737          1
## 2    AZ      43243         56          6
## 3    CA       1234        123          8

3) xtabs The question seems to want to label the horizontal axis as answer and that can't be done with a data frame. We can do that with a matrix or table but then we will need to put the states into row names rather than a column. If the precise names and dimnames are not important then omit the second line. This also uses only base R.

xt <- xtabs(count ~ State + Year, test)
dimnames(xt) <- list(rownames(xt), answer = paste0("count", colnames(xt)))
xt
##     answer
##      count2005 count2006 count2007
##   AK     12312      6737         1
##   AZ     43243        56         6
##   CA      1234       123         8
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • I think my edit to the question may have been misleading. As far as I know, OP did not want to label the axis answer; I typed `answer` to simulate the print in the console. Sorry for wasting your time. – Skaqqs Jan 20 '22 at 15:45