I have some data in R that looks like this.

year  freq
<int> <int>
1902    2           
1903    2           
1905    1           
1906    4           
1907    1           
1908    1           
1909    1           
1912    1           
1914    1           
1915    1

The data was read in using the following code.

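library(plyr)  # count() with a quoted column name and a 'freq' result is plyr's count()

data <- read.csv("earthquakes.csv")
my_var <- c('year')
new_data <- data[my_var]        # year column only (not used below)
counts <- count(data, 'year')   # one row per year present, with its frequency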

This is page 1 of a 7-page table. I need to fill in the missing years from 1900 to 1999 with a count of 0. How would I go about this? I haven't been able to find an example online where year is the primary column.

  • please provide a reprex https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Bruno Oct 20 '21 at 17:06

2 Answers

We may use complete from tidyr on the 'counts' data, with the year range from the question (1900 to 1999)

library(tidyr)
complete(counts, year = 1900:1999, fill = list(freq = 0))
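
As a quick check of the complete call, here is a minimal self-contained sketch. It assumes the tidyr package is installed and that the target range is 1900 to 1999 as stated in the question. The counts data frame is rebuilt from the sample rows shown in the question, and the name filled is only illustrative.

```r
library(tidyr)

# Sample counts copied from the question (assumption: the real 'counts'
# has the same two columns, year and freq)
counts <- data.frame(
  year = c(1902L, 1903L, 1905L, 1906L, 1907L, 1908L, 1909L, 1912L, 1914L, 1915L),
  freq = c(2L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L)
)

# Add a row for every year in 1900:1999 that is absent, with freq set to 0
filled <- complete(counts, year = 1900:1999, fill = list(freq = 0))

nrow(filled)       # one row per year in the range
sum(filled$freq)   # the total is unchanged by the added zero rows
```

Years already present keep their original freq; only the added rows receive the fill value.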
akrun

1) Convert the input, shown in the Note, to zoo class and then to ts class. The latter will fill in the missing years with NA. Replace the NAs with 0, convert back to a data frame and set the names to the original names.

If a ts series is ok as output then omit the last two lines. If in addition it is ok to use NA rather than 0 then omit the last three lines.

library(zoo)

DF |>
  read.zoo() |>
  as.ts() |>
  na.fill(0) |>
  fortify.zoo() |>
  setNames(names(DF))

giving:

   year freq
1  1902    2
2  1903    2
3  1904    0
4  1905    1
5  1906    4
6  1907    1
7  1908    1
8  1909    1
9  1910    0
10 1911    0
11 1912    1
12 1913    0
13 1914    1
14 1915    1

2) For a base R solution, use merge. Omit the last line if NA is ok instead of 0.

m <- merge(DF, data.frame(year = min(DF$year):max(DF$year)), all = TRUE)
transform(m, freq = replace(freq, is.na(freq), 0))
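
The merge above spans only the years between the smallest and largest year present in DF. The following is a sketch of the same idea for a fixed range, assuming the 1900 to 1999 range from the question. The names all_years and filled are illustrative, and DF is rebuilt here from the sample rows so the block runs on its own.

```r
# Sample input, same values as DF in the Note
DF <- data.frame(
  year = c(1902L, 1903L, 1905L, 1906L, 1907L, 1908L, 1909L, 1912L, 1914L, 1915L),
  freq = c(2L, 2L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L)
)

# Every year that should appear in the result (assumed range)
all_years <- data.frame(year = 1900:1999)

# Left join: keep all years, freq is NA where DF has no row for that year
filled <- merge(all_years, DF, by = "year", all.x = TRUE)

# Replace the NA counts with 0
filled$freq[is.na(filled$freq)] <- 0L

head(filled)
```

If NA is acceptable instead of 0, the replacement step can be dropped.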

Note

Lines <- "year  freq
1902    2           
1903    2           
1905    1           
1906    4           
1907    1           
1908    1           
1909    1           
1912    1           
1914    1           
1915    1"
    
DF <- read.table(text = Lines, header = TRUE)
G. Grothendieck