I'm getting an error, and I believe the root cause is that not every combination of my grouping variables has a value.
Data can be downloaded here: https://opendata.miamidade.gov/311/311-Service-Requests-Miami-Dade-County/dj6j-qg5t
What I want is a function that takes a nested grouping, detects all of the holes, and fills them with zeros. Let's take the following code sample:
d <- rDSamp %>%
  FilterDateRange("Ticket.Created.Date...Time", "1/1/2013", "12/31/2013") %>%
  group_by(Ticket.Created.Date...Time, Case.Owner) %>%
  summarise(
    count = n()
  ) %>%
  arrange(Ticket.Created.Date...Time)
After the summarise, I need to add a function that goes through every date and, if a case owner does not appear on that date, adds a row for that case owner with a count of 0.
Here is the code to get to this point:
library("ggvis")
library("magrittr")
library("dplyr")
library("tidyr")
library("shiny")
library("checkpoint")
checkpoint("2016-03-29")
rData <- read.csv("C:\\data\\Miami_311.csv",
                  header=TRUE,
                  sep=",")
# Random sample of 1000 rows (note: the next assignment starts again from rData and overwrites it)
rDSamp <- rData[sample(1:length(rData$Case.Owner), 1000),]
# Keep only the case owners of interest
rDSamp = rData %>%
  subset(Case.Owner %in% c(
    "Animal_Services",
    "Waste_Management",
    "Community_Information_and_Outreach"))
# Re-create the factor so that unused levels are dropped
rDSamp$Case.Owner = factor(rDSamp$Case.Owner)
# Convert to a known date-time format
rDSamp$Ticket.Created.Date...Time <-
  rDSamp$Ticket.Created.Date...Time %>%
  as.POSIXct(format="%m/%d/%Y") %>%
  as.character()
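# Keep rows whose `feature` column (dates stored as text) lies between minDate and maxDate
FilterDateRange = function(data, feature, minDate, maxDate) {
  # Convert the bounds the same way the column was converted above
  minDate = minDate %>%
    as.POSIXct(format="%m/%d/%Y") %>%
    as.character()
  maxDate = maxDate %>%
    as.POSIXct(format="%m/%d/%Y") %>%
    as.character()
  # Use [[ ]] to get a plain vector; [ ] would return a one-column data frame
  result = subset(data, data[[feature]] <= maxDate)
  subset(result, result[[feature]] >= minDate)
}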
d <- rDSamp %>%
  FilterDateRange("Ticket.Created.Date...Time", "1/1/2013", "12/31/2013") %>%
  group_by(Ticket.Created.Date...Time, Case.Owner) %>%
  summarise(
    count = n()
  ) %>%
  arrange(Ticket.Created.Date...Time)
For completeness: I'm trying to use ggvis layer_smooths, and it reports "NAs introduced by coercion". My assumption is that the holes in the data are causing this.
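One thing that may be worth ruling out separately from the holes (this is an assumption, not something confirmed against the 311 data): after the `as.character()` step the date column is text, and converting a date string to a number produces exactly this warning. A minimal sketch on made-up values, not on the real file:

```r
# Made-up date strings in the same text format the pipeline produces
x <- c("2013-01-01", "2013-01-02")

# Coercing date text straight to numeric yields NA
# (this is the step that emits "NAs introduced by coercion")
as_num <- suppressWarnings(as.numeric(x))

# Converting to Date first gives a value that coerces cleanly
# (days since 1970-01-01)
as_date <- as.Date(x)
date_num <- as.numeric(as_date)
```

If the warning persists after the date column is a `Date`, the holes remain the more likely cause.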
I found one solution, but I'm looking for a more generic one:
FillDataHolesWithZeros = function(input){
  # Drop the grouping left over from summarise() before appending rows
  input = ungroup(input)
  allOwners = levels(input$Case.Owner)
  # Dates on which at least one case owner is missing
  countZero = input %>%
    group_by(Ticket.Created.Date...Time) %>%
    summarise(count = n()) %>%
    filter(count < length(allOwners))
  # seq_len() avoids iterating over 1:0 when no date has holes
  for(i in seq_len(nrow(countZero)))
  {
    date = countZero[i,]$Ticket.Created.Date...Time
    departments = input %>% filter(Ticket.Created.Date...Time == date)
    # Case owners with no row on this date
    myLevels = setdiff(allOwners, as.character(departments$Case.Owner))
    print(paste(i,":",myLevels))
    for(k in seq_along(myLevels)){
      input = input %>% rbind(data.frame(
        Ticket.Created.Date...Time = date,
        Case.Owner = myLevels[k],
        count = 0
      ))
    }
  }
  return(input)
}
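For comparison, a more generic version of the same idea could probably be built on `tidyr::complete()`, which expands a data frame to every combination of the named columns and fills the new rows with chosen values. The sketch below is only an illustration on a made-up stand-in for `d` (the `toy` data frame and its values are hypothetical), and it assumes a tidyr version that provides `complete()`:

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the summarised data `d` (values are made up)
toy <- data.frame(
  Ticket.Created.Date...Time = c("2013-01-01", "2013-01-01", "2013-01-02"),
  Case.Owner = factor(c("Animal_Services", "Waste_Management", "Animal_Services")),
  count = c(3L, 5L, 2L),
  stringsAsFactors = FALSE
)

filled <- toy %>%
  # Ungroup first so combinations are built across the whole table,
  # not within a leftover group from summarise()
  ungroup() %>%
  # Add a row for every date / case owner pair that is missing,
  # setting count to 0 on the added rows
  complete(Ticket.Created.Date...Time, Case.Owner, fill = list(count = 0L)) %>%
  arrange(Ticket.Created.Date...Time, Case.Owner)
```

Because `Case.Owner` is a factor, `complete()` uses all of its levels, so re-creating the factor after subsetting (as done above) keeps unwanted owners from reappearing as zero rows. The same call should work for any set of grouping columns by listing them in place of the two used here.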