I am trying to segment Census data from fairly disaggregated data (e.g. age variables in 5-yr groups) and create summary variables by aggregation (e.g. all males 18+ per county). My solution is rowSums, e.g. county$MalesOver18 <- rowSums(county[,c(68:87)]), where vars 68-87 sum to males 18+. This works fine. However, with 500 variables it is not efficient to count out the positions of my start/end columns.
But when I use my preferred solution, a range of column names inside rowSums (e.g. rowSums(county[,c(H76007:H76025)]), where the H vars are field names), I get one of 2 error messages:
Run w/ col names in quotes:
Error in "H76007":"H76025" : NA/NaN argument
In addition: Warning messages:
1: In `[.data.frame`(county, , c("H76007":"H76025")) :
  NAs introduced by coercion
2: In `[.data.frame`(county, , c("H76007":"H76025")) :
  NAs introduced by coercion
Run w/ col names not in quotes:
Error in `[.data.frame`(county, , c(H76007:H76025)) :
  object 'H76007' not found
I have tried using the na.rm argument & setting my variables as numeric (although they are already integers), all to no avail.
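To make the goal concrete, here is a minimal sketch on a toy data frame. The column names and values are made up for illustration and are not my real data. It assumes the columns to be summed are contiguous, and it looks up the start/end positions by name with match() before building the integer range, since the : operator only works on numbers. I believe subset() also accepts a name range in its select argument, but I am treating that as a second option to verify rather than a confirmed fix.

```r
# Toy stand-in for the real county data; names and values are hypothetical.
county <- data.frame(
  GEOID  = c("01001", "01003"),
  H76006 = c(5L, 7L),
  H76007 = c(10L, 20L),
  H76008 = c(1L, 2L),
  H76009 = c(3L, 4L),
  H76010 = c(100L, 200L)
)

# Look up the positions of the first and last column by name,
# then build the integer range between them.
first <- match("H76007", names(county))
last  <- match("H76009", names(county))
county$MalesOver18 <- rowSums(county[, first:last])

# Alternative: subset() treats names in `select` as column positions,
# so a name range can be written directly there.
alt <- rowSums(subset(county, select = H76007:H76009))
```

On the real data the same pattern would use "H76007" and "H76025" as the start and end names.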
Any guidance? Thanks.