
I have a large dataset of bird observations. I would like to count observations by group, i.e. the species observed by year, season, and grid.

For example, how many American Crows (AMCR) were observed in 2017? Or how many American Robins were observed in 2017 in the breeding season (BB in the Season column)?

Here's an example of my headers and first line of data:

Year  Season  Date       Grid  Species  Count  Behavior
2015  BB      22-Jul-15  FF    AMCR     1      C

I tried to use dplyr's `count_` and `group_by`, but I think I'm doing it wrong. Please help!

Ronak Shah

  • It would be helpful to have a reproducible example. You can run `dput()` on your data frame and paste the output into your post. – Jacky Feb 06 '20 at 00:33
  • `df %>% group_by(year,season,grid) %>% summarize(n=n())` could be something you're looking for. – Jacky Feb 06 '20 at 00:34
  • Does this answer your question? [Frequency count of two column in R](https://stackoverflow.com/questions/10879551/frequency-count-of-two-column-in-r) – camille Feb 06 '20 at 03:42

2 Answers

It sounds like you're trying to count the number of observations within each group. This is what `count()` in dplyr is designed for. The trick is that you don't need a `group_by()` before it: `count()` does the grouping itself.

Here is some example code:

library(dplyr)
data("storms")

count_by_group <- storms %>%
  # The variables you want to count observations within
  count(year, month, status)
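
Applied to the columns in the question, the same idea might look like the sketch below. The `birds` data frame is a made-up toy example that only borrows the question's column names; the real data will differ. The `wt` argument of `count()` sums a column instead of counting rows, which matters if each row records more than one bird.

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
birds <- data.frame(
  Year    = c(2017, 2017, 2017, 2015),
  Season  = c("BB", "BB", "BB", "BB"),
  Grid    = c("FF", "FF", "FF", "FF"),
  Species = c("AMCR", "AMCR", "AMRO", "AMCR"),
  Count   = c(1, 3, 2, 1)
)

# Number of observation records (rows) in each group
records_by_group <- birds %>%
  count(Year, Season, Grid, Species)

# Number of birds in each group: each row is weighted by its Count value
birds_by_group <- birds %>%
  count(Year, Season, Grid, Species, wt = Count, name = "BirdCount")
```

Which of the two is appropriate depends on whether a row stands for one bird or for one sighting of several birds.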

Alternatively, if you have a variable called "Count" in your raw data and you want to sum it within each group, you should instead use `summarize()` with `group_by()`:

sum_by_group <- storms %>%
  group_by(year, month, status) %>%
  # Summing pressure isn't meaningful; it stands in for whatever variable you want to sum (e.g. Count)
  summarize(Count = sum(pressure))
NickCHK

Here is another solution using dplyr. It is similar to the one previously suggested; however, I think it might be closer to what you want to do.
To count the number of distinct species observed by year, season, and grid:

#Count number of species
df %>%
  #Grouping variables
  group_by(Year, Season, Grid) %>%
  #Remove possible duplicates in the species column
  distinct(Species) %>%
  #Count number of species
  count(name = "SpCount")
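
The same species count can probably also be written with `n_distinct()` inside `summarize()`, which some readers find easier to follow. This is a sketch on a made-up `birds` data frame that reuses the question's column names; the `.groups = "drop"` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
birds <- data.frame(
  Year    = c(2017, 2017, 2017, 2015),
  Season  = c("BB", "BB", "BB", "BB"),
  Grid    = c("FF", "FF", "FF", "FF"),
  Species = c("AMCR", "AMCR", "AMRO", "AMCR"),
  Count   = c(1, 3, 2, 1)
)

# Number of distinct species in each Year/Season/Grid combination
species_by_group <- birds %>%
  group_by(Year, Season, Grid) %>%
  summarize(SpCount = n_distinct(Species), .groups = "drop")
```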

To count the number of observed birds by species, year, season and grid:

#Count number of birds per species
df %>%
  #Grouping variables
  group_by(Species, Year, Season, Grid) %>%
  #Count number of birds per species
  summarize(BirdCount = sum(Count))
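
To answer the two specific questions from the post, one option is to filter first and then sum. The sketch below uses a made-up `birds` data frame with the question's column names, and it assumes that American Robins are coded as "AMRO", which the question does not state.

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
birds <- data.frame(
  Year    = c(2017, 2017, 2017, 2015),
  Season  = c("BB", "BB", "BB", "BB"),
  Grid    = c("FF", "FF", "FF", "FF"),
  Species = c("AMCR", "AMCR", "AMRO", "AMCR"),
  Count   = c(1, 3, 2, 1)
)

# How many American Crows (AMCR) were observed in 2017?
amcr_2017 <- birds %>%
  filter(Species == "AMCR", Year == 2017) %>%
  summarize(BirdCount = sum(Count))

# How many American Robins were observed in 2017 in the breeding season?
# "AMRO" is an assumed species code, not taken from the question
amro_2017_bb <- birds %>%
  filter(Species == "AMRO", Year == 2017, Season == "BB") %>%
  summarize(BirdCount = sum(Count))
```

Each result is a one-row data frame; `amcr_2017$BirdCount` pulls out the number itself.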