My question is related to this one from 2013 R: Count unique values by category
using the following data in R:
set.seed(1)
mydf <- data.frame(
Cnty = rep(c("185", "31", "189"), times = c(5, 3, 2)),
Yr = c(rep(c("1999", "2000"), times = c(3, 2)),
"1999", "1999", "2000", "2000", "2000"),
Plt = "20001",
Spp = sample(c("Bitternut", "Pignut", "WO"), 10, replace = TRUE),
DBH = runif(10, 0, 15)
)
mydf
# Cnty Yr Plt Spp DBH
# 1 185 1999 20001 Bitternut 3.089619
# 2 185 1999 20001 Pignut 2.648351
# 3 185 1999 20001 Pignut 10.305343
# 4 185 2000 20001 WO 5.761556
# 5 185 2000 20001 Bitternut 11.547621
# 6 31 1999 20001 WO 7.465489
# 7 31 1999 20001 WO 10.764278
# 8 31 2000 20001 Pignut 14.878591
# 9 189 2000 20001 Pignut 5.700528
# 10 189 2000 20001 Bitternut 11.661678
What i'd like to be able to do and what was not done by the previous asker or answerers is:
Count how many counties each species exists in, which is very simply done with a table function
However, in my data there are over a million rows five different species and I don't know how many counties (a very large number anyway)
How could I get a table that gives me an answer of:
Species count_of_Counties
bitternut 2
pignut 3
WO 2
instead of the following answer:
Cnty
# Spp 185 189 31
# Bitternut 2 1 0
# Pignut 2 1 1
# WO 1 0 2
If I attempt this solution I will have well over 400,000 columns