I have a dataframe, called dets_per_month, that looks like so...

**Zone     month   yearcollected   total**
1         Jul        2017            183
1         Jul        2015            18
1         Aug        2015            202
1         Aug        2017            202
1         Aug        2017            150
1         Sep        2017            68
2         Apr        2018            65
2         Jun        2018            25
2         Sep        2018            278

I'm trying to fill in 0's for months that have no total in a particular zone. This is the code I tried:

complete(dets_per_month, nesting(zone, month), yearcollected = 2016:2018, fill = list(count = 0))

But the output doesn't give me any 0's; instead, it adds on columns from my original dataframe. Can anyone tell me how to get the 0's?

Kristen Cyr
  • Couple things: The code up top for creating data is invalid. `c(Jan:Dec)` won't actually do anything. "the function mean() didn't input 0's when there was no observations for a particular month, year, and zone"—you didn't tell it to. If you want to add observations with missing data for all possible combinations, there are lots of ways to do that, including `tidyr::complete` or `expand.grid`. Other posts already cover that. Beyond that, we'll need a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – camille Dec 22 '19 at 19:27

1 Answer

You could use complete after grouping by Zone and yearcollected. month.abb, a built-in constant holding the abbreviated English month names, supplies the full set of months to fill in.

library(dplyr)

df %>%
  group_by(Zone, yearcollected) %>%
  tidyr::complete(month = month.abb, fill = list(total = 0))

#    Zone yearcollected month total
#   <int>         <int> <chr> <dbl>
# 1     1          2015 Apr       0
# 2     1          2015 Aug     202
# 3     1          2015 Dec       0
# 4     1          2015 Feb       0
# 5     1          2015 Jan       0
# 6     1          2015 Jul      18
# 7     1          2015 Jun       0
# 8     1          2015 Mar       0
# 9     1          2015 May       0
#10     1          2015 Nov       0
# … with 27 more rows
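
If you would rather avoid the tidyr dependency, roughly the same result can be sketched in base R with merge. This is only an illustration, not the only way to do it: the names groups, grid, and filled are made up for the example, and the data frame is rebuilt inline (with month as character) so the block runs on its own. It also sorts months in calendar order rather than alphabetically.

```r
# Example data, same values as in the "data" section of this answer
df <- data.frame(
  Zone = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L),
  month = c("Jul", "Jul", "Aug", "Aug", "Aug", "Sep", "Apr", "Jun", "Sep"),
  yearcollected = c(2017L, 2015L, 2015L, 2017L, 2017L, 2017L,
                    2018L, 2018L, 2018L),
  total = c(183L, 18L, 202L, 202L, 150L, 68L, 65L, 25L, 278L),
  stringsAsFactors = FALSE
)

# Zone/year pairs that actually occur in the data (hypothetical helper name)
groups <- unique(df[c("Zone", "yearcollected")])

# Cross every observed pair with all twelve months;
# merge() with by = NULL returns the Cartesian product
grid <- merge(groups,
              data.frame(month = month.abb, stringsAsFactors = FALSE),
              by = NULL)

# Left join the original rows onto the grid; missing combinations become NA
filled <- merge(grid, df,
                by = c("Zone", "yearcollected", "month"),
                all.x = TRUE)

# Replace the NA totals with 0
filled$total[is.na(filled$total)] <- 0L

# Sort by zone, year, then calendar position of the month
filled <- filled[order(filled$Zone,
                       filled$yearcollected,
                       match(filled$month, month.abb)), ]
rownames(filled) <- NULL

head(filled, 12)
```

As with the complete version, only Zone/year pairs that already appear in the data are expanded, and duplicate rows (such as the two Aug 2017 entries for Zone 1) are kept as they are.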

data

df <- structure(list(Zone = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L), 
month = structure(c(3L, 3L, 2L, 2L, 2L, 5L, 1L, 4L, 5L), .Label = c("Apr", 
"Aug", "Jul", "Jun", "Sep"), class = "factor"), yearcollected = c(2017L, 
2015L, 2015L, 2017L, 2017L, 2017L, 2018L, 2018L, 2018L), 
total = c(183L, 18L, 202L, 202L, 150L, 68L, 65L, 25L, 278L
)), class = "data.frame", row.names = c(NA, -9L))
Ronak Shah