I'm trying to check my math after adding two columns to create a new column. I'm using this check, per this article:
df$TotalAnimalMathCorrect <- sapply(df$TotalAnimals, identical, df$TotalFemales+df$TotalMales)
I am looking for any FALSE values that would indicate that my summation isn't working right.
I calculate female and male animals using this:
df$TotalMales <- apply(subset(df, select = c(Gender.1,Gender.2,Gender.3,Gender.4)), 1, function(x) length(which(x=="Male")))
# convert to a numeric variable
df$TotalMales <- as.numeric(df$TotalMales)
and
df$TotalFemales <- apply(subset(df, select = c(Gender.1,Gender.2,Gender.3,Gender.4)), 1, function(x) length(which(x=="Female")))
# convert to a numeric variable
df$TotalFemales <- as.numeric(df$TotalFemales)
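In case it helps, here is a self-contained sketch of that counting step on a tiny made-up data frame. The name `demo`, the helper vector `gender_cols`, and the Gender values are hypothetical and only stand in for my real data:

```r
# Hypothetical sample rows; my real data has the same four Gender columns
demo <- data.frame(
  Gender.1 = c("Male", "Female", "Female"),
  Gender.2 = c("Female", NA, "Female"),
  Gender.3 = c(NA, NA, "Male"),
  Gender.4 = c(NA_character_, NA_character_, NA_character_),
  stringsAsFactors = FALSE
)
gender_cols <- c("Gender.1", "Gender.2", "Gender.3", "Gender.4")

# Count the "Male" / "Female" entries in each row (NA entries are skipped by which())
demo$TotalMales <- apply(demo[gender_cols], 1, function(x) length(which(x == "Male")))
demo$TotalFemales <- apply(demo[gender_cols], 1, function(x) length(which(x == "Female")))

# Convert the counts to numeric, as in my real code
demo$TotalMales <- as.numeric(demo$TotalMales)
demo$TotalFemales <- as.numeric(demo$TotalFemales)

str(demo$TotalMales)
str(demo$TotalFemales)
```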
When I look at the data, I can see that I am adding correctly, but since I have 170,000 rows, I'd like to double-check that TotalAnimals always equals the sum of the female and male animals.
But I am always getting FALSE for every value in df$TotalAnimalMathCorrect, even when I can see that 1+1 = 2, the value in df$TotalAnimals.
I've also checked and confirmed that all three columns are numeric. I applied as.numeric before adding the numbers, as you can see above and here:
> str(df$TotalMales)
num [1:16929] 1 0 0 1 0 0 0 0 0 0 ...
> str(df$TotalFemales)
num [1:16929] 0 1 1 0 1 0 2 1 1 0 ...
> str(df$TotalAnimals)
num [1:16929] 1 1 1 1 1 1 2 1 1 1 ...
I also tried converting the variables to integer with as.integer instead of as.numeric, to be more specific, but every row still has FALSE in the TotalAnimalMathCorrect column.
Any ideas as to why the identical call isn't giving a TRUE when the numbers clearly match? I read the documentation on identical here
Here's some sample data of what I expect:
TotalMales TotalFemales TotalAnimals TotalAnimalMathCorrect
1 1 2 TRUE
but, like I said, I'm getting this:
TotalMales TotalFemales TotalAnimals TotalAnimalMathCorrect
1 1 2 FALSE
Here is reproducible code.
df<- data.frame(TotalMales=c(1,1,0),TotalFemales=c(1,0,0),TotalAnimals=c(2,1,0))
TotalMales TotalFemales TotalAnimals
1 1 1 2
2 1 0 1
3 0 0 0
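Running my check on this sample data reproduces the problem. This is just the data frame above plus the sapply/identical line from the top of the question, and every row comes back FALSE even though each total matches the sum:

```r
# Sample data from above
df <- data.frame(TotalMales = c(1, 1, 0),
                 TotalFemales = c(1, 0, 0),
                 TotalAnimals = c(2, 1, 0))

# The check exactly as I am running it
df$TotalAnimalMathCorrect <- sapply(df$TotalAnimals, identical,
                                    df$TotalFemales + df$TotalMales)

# All three rows show FALSE in TotalAnimalMathCorrect
print(df)
```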
Thanks very much!