-1

I have the data such as below

| Year | Qrtrs |Type |Freq |

| 1950 | JAS | BS | 1 |

| 1950 | OND | BS | 1 |

| 1950 | OND | BS | 1 |

| 1950 | OND | BY | 2 |


I would like to produce like this

+------+------+-----+----+----+
| Year | Qrts | All | BS | BY |
+------+------+-----+----+----+
| 1950 | JAS  |   1 |  1 |  0 |
| 1950 | OND  |   4 |  2 |  2 |
+------+------+-----+----+----+

I tried using tapply, but didn't get what I wanted.

Thank you!

cumulus
  • 19
  • 3
  • Please do not post an image of code/data/errors: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: https://meta.stackoverflow.com/a/285557/3358272. Please also include code you've tried; "didn't get what I wanted" tells us neither (1) what didn't work, nor (2) what you are expecting. – r2evans Nov 07 '18 at 01:58
  • Possible duplicate of [How to reshape data from long to wide format?](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) – phiver Nov 07 '18 at 12:35

1 Answers1

1

Your sample data, copyable and usable:

dat <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
Year Qrtrs Month Type freq
1 1950 JAS 9 BS 1
2 1950 OND 10 BS 1
3 1950 OND 11 BS 1
4 1950 OND 12 BY 1 ')

Using dplyr:

library(dplyr)
dat %>%
  group_by(Year, Qrtrs, Type) %>%
  summarize(count = sum(freq)) %>%
  tidyr::spread(Type, count, fill=0) %>%
  mutate(All = sum(BS:BY))
# # A tibble: 2 x 5
# # Groups:   Year, Qrtrs [2]
#    Year Qrtrs    BS    BY   All
#   <int> <chr> <dbl> <dbl> <int>
# 1  1950 JAS       1     0     1
# 2  1950 OND       2     1     3

Using data.table:

library(data.table)
DT <- as.data.table(dat)
dcast(DT[,.(count = sum(freq)), by=c("Year", "Qrtrs", "Type")],
      Year + Qrtrs ~ Type, value.var = "count", fun=sum)[
       , All := BS+BY,][]
#    Year Qrtrs BS BY All
# 1: 1950   JAS  1  0   1
# 2: 1950   OND  2  1   3

I think your expected output above is incorrect, its counts for BS and BY are inconsistent.

There's a base-R solution somewhere starting with

aggregate(freq ~ Year + Qrtrs + Type , data=dat, FUN=sum)

but I've run out of time ...

r2evans
  • 141,215
  • 6
  • 77
  • 149