The data I am working with is from eBird, and I am looking to sort out species occurrence by both name and year. There are over 30k individual observations, each with its own number of birds. For example, in the raw data posted below, on Jan 1, 2021 someone observed 2 Cooper's Hawks.
The raw data looks like this:
specificName individualCount eventDate year
Cooper's Hawk 1 (1/1/2018) 2018
Cooper's Hawk 1 (1/1/2020) 2020
Cooper's Hawk 2 (1/1/2021) 2021
Ideally, I would be able to group all the Cooper's Hawks (specificName) by the year they were observed and sum the total individualCount. That way I can make statistical comparisons between the number of birds observed in 2018, 2019, 2020, and 2021.
I created a separate column for the year:
year <- as.POSIXct(ebird.df$eventDate, format = "%m/%d/%Y")
ebird.df$year <- as.numeric(format(year, "%Y"))
Then I aggregated with the following:
aggdata <- aggregate(ebird.df$individualCount, by = list(ebird.df$specificName, ebird.df$year), FUN = sum)
There are hundreds of bird species, and Cooper's Hawks start on the 115th row, so the output looks like this:
Group.1 Group.2 x
115 2018 Cooper's Hawk 86
116 2019 Cooper's Hawk 152
117 2020 Cooper's Hawk 221
118 2021 Cooper's Hawk 116
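For anyone who wants to reproduce the steps above, here is a minimal sketch on a few toy rows. The rows are made up for illustration (they are not the real eBird data), and it assumes eventDate is a plain "m/d/Y" string. It uses as.Date instead of as.POSIXct, and the formula interface of aggregate, which should keep the original column names instead of Group.1, Group.2 and x.

```r
# Toy data mirroring the columns above (values are made up for illustration)
ebird.df <- data.frame(
  specificName    = rep("Cooper's Hawk", 4),
  individualCount = c(1, 1, 2, 3),
  eventDate       = c("1/1/2018", "1/1/2020", "1/1/2021", "2/5/2021"),
  stringsAsFactors = FALSE
)

# Parse the date string and pull out the year as a number
dates <- as.Date(ebird.df$eventDate, format = "%m/%d/%Y")
ebird.df$year <- as.numeric(format(dates, "%Y"))

# Sum individualCount for every species/year pair;
# the formula interface names the output columns after the inputs
aggdata <- aggregate(individualCount ~ specificName + year,
                     data = ebird.df, FUN = sum)
print(aggdata)
```

The result is still in "long" form, with one row per species and year.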
My question is how do I get the data into a table that looks like the following:
Species Name 2018 2019 2020 2021
Cooper's Hawk 86 152 221 116
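One thing that might give this shape in base R is xtabs, which cross-tabulates a count column by two grouping columns. The sketch below is only an assumption about how it could work here: the column names year, species and count are hypothetical stand-ins for Group.1, Group.2 and x, and the four rows are the Cooper's Hawk totals shown above.

```r
# Long-format data shaped like the aggregate() output above
# (column names are hypothetical stand-ins for Group.1, Group.2, x)
aggdata <- data.frame(
  year    = c(2018, 2019, 2020, 2021),
  species = rep("Cooper's Hawk", 4),
  count   = c(86, 152, 221, 116),
  stringsAsFactors = FALSE
)

# Sum `count` for every species x year combination:
# species become rows, years become columns
wide <- xtabs(count ~ species + year, data = aggdata)

# Convert the contingency table into a plain data frame,
# with species as row names and one column per year
wide.df <- as.data.frame.matrix(wide)
print(wide.df)
```

Since vegan functions generally expect samples in rows and species in columns, the table may need to be flipped with t(wide) before using it there.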
I want to eventually run some basic ecology stats on the data using vegan, but one problem first I guess lol
Thanks!