
dplyr::tally is faster than dplyr::count. Why doesn't tally read a variable (a function argument) when it is called inside a function?

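For this example, take a sample x such as the following (the outputs shown below come from my full data set, not from this four-row sample):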

x <- data.frame("PrecinctID" = c(101,102,103,104))


# The second argument of tally() is wt, so this sums the logical condition
tally(x,PrecinctID == 101)[1,1]

#[1] 919 

findy <- function(y) {tally(x,PrecinctID == y)[1,1]} 

findy(101)

#Error: object 'y' not found

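# count() groups by the logical expression; when both FALSE and TRUE occur,
# row 2 is the TRUE group and column 2 is n
findy <- function(y) {count(x,PrecinctID == y)[2,2]}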

findy(101)  

#Source: local data frame [1 x 1]

#    n
#1 919
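
A version-independent way to get the same number, as a minimal sketch: do the comparison in base R, so that y is resolved by ordinary function scoping and never passes through tally's wt argument (which appears to be where the lookup of y fails). The name findy_base is hypothetical, and x here is the four-row sample from above, not the full data.

```r
# Four-row sample from the question (not the full voter data)
x <- data.frame("PrecinctID" = c(101, 102, 103, 104))

# Hypothetical helper: count the rows whose PrecinctID equals y.
# x$PrecinctID == y is evaluated by plain R, so y is found as the
# function argument; sum() then counts the TRUE values.
findy_base <- function(y) {
  sum(x$PrecinctID == y, na.rm = TRUE)
}

findy_base(101)
#[1] 1
```

The result is 1 here only because the sample has a single row for precinct 101.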

[Self answer:]

I was able to solve my own problem. As far as I can tell, tally accepts only tbl data, so whether you use tally or summarise, it works well to pass the data through the dplyr pipe (%>%), also called the "then" operator. Once you do that, quite complex computed fields with embedded conditions can be built. Given that x is a large voter database:

tbl_df(x)
Source: local data frame [128,438 x 17] ...

StateVoterID RegistrationNumber LastName FirstName ...

uPID <- sort(unique(x$PrecinctID))
findP <- function(y) {
  x %>%
    summarise(
      Count = sum(PrecinctID == y),
      Good = sum(AVReturnStatus == "Good" & PrecinctID == y),
      Late = sum(AVReturnChallenge == "Too Late" & PrecinctID == y))
}

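# Apply findP to every precinct; transpose so each precinct is one row
u1 <- t(sapply(uPID,findP))
# Put the precinct IDs in front of the counts
u1 <- cbind(uPID,u1)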


head(u1)  
     uPID Count Good Late  
[1,] 101  917   476  4   
[2,] 102  630   367  8   
[3,] 103  687   482  2   
[4,] 104  439   312  1   
[5,] 105  414   252  0   
[6,] 106  778   422  2   
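
The per-precinct loop above could also be written as a single grouped summary, which avoids passing y into a function at all. This is a minimal sketch, assuming dplyr is installed; the data frame voters and its values are made up for illustration, and only the column names are taken from the code above.

```r
library(dplyr)

# Made-up stand-in for the voter database (hypothetical values)
voters <- data.frame(
  PrecinctID = c(101, 101, 102, 103),
  AVReturnStatus = c("Good", "Bad", "Good", "Good"),
  AVReturnChallenge = c("", "Too Late", "", ""),
  stringsAsFactors = FALSE
)

# One row per precinct: total rows, "Good" returns, and "Too Late" challenges
by_precinct <- voters %>%
  group_by(PrecinctID) %>%
  summarise(
    Count = n(),
    Good = sum(AVReturnStatus == "Good"),
    Late = sum(AVReturnChallenge == "Too Late")
  )

by_precinct
```

Because the grouping handles the split by precinct, the conditions no longer need the `& PrecinctID == y` term, and the result is already a data frame rather than a matrix.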
rferrisx
  • Example is not reproducible. What are x and y? – Pierre Lapointe Nov 07 '15 at 19:31
  • x is a large data.frame. y is the argument for the function, in this case '101'. You can use your own fields or data.frames. I want to know if there is a workaround to allow it to read an alpha variable in a function. – rferrisx Nov 07 '15 at 20:33
  • Please read this: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Pierre Lapointe Nov 07 '15 at 20:37
  • Is this a question? If you want to self-answer, you should put the answer in the box lower down the page. – Frank Nov 08 '15 at 17:39
