I am trying to create a summary table that shows each bike's usage within a borough. The formula is
(No. of times a bike is rented in a particular borough) / (Total no. of rentals in that borough).
The final output should look something like this:
BikeId  Borough      Pct
1       K&C          0.02
1       Hammersmith  0.45
7       K&C          0.32
To achieve that I am trying to implement a function as below:
smplData <- function(df) {
  # initialize an empty dataframe
  summDf <- data.frame(BikeId = character(), Borough = character(), Pct = double())
  # create a vector of unique borough names
  boro <- unique(df[, "Start.Borough"])
  for (i in 1:length(boro)) {
    # loop through each borough and create a frequency table
    bkCntBor <- table(df[df$Start.Borough == boro[i], "Bike.Id"])
    # total number of rentals in a particular borough
    borCnt <- nrow(df[df$Start.Borough == boro[i], ])
    for (j in 1:length(bkCntBor)) {
      # loop through each bike for the ith borough and calculate the ratio of the jth bike
      bkPct <- as.vector(bkCntBor[j]) / borCnt
      # temp dataframe to store a single row corresponding to bike, borough and ratio
      dfTmp <- data.frame(BikeId = names(bkCntBor[j]), Borough = boro[i], Pct = bkPct)
      # append to summary table
      summDf <<- rbind(summDf, dfTmp)
    }
  }
}
The head of the df dataset is as below:
> head(df)
Bike.Id  Start.Borough  Rental.Id
1        K&C            61349872
1        K&C            61361611
1        Royal Parks    61362295
1        K&C            61364627
1        K&C            61367817
1        H&F            61368333
When I run the function, after inserting one record in summDf I get the error below:
Error in data.frame(BikeId = names(bkCntBor[j]), Borough = boro[i], Pct = bkPct) : arguments imply differing number of rows: 0, 1
I can run the function code in the console by passing one value at a time for i and j. But when I run it as a function I get the error mentioned above.
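One possible cause worth ruling out (an assumption, not confirmed against the full dataset): if the frequency table for some borough is empty, for example because Start.Borough contains NA, then 1:length(bkCntBor) evaluates to c(1, 0) instead of an empty sequence, and indexing with 0 produces zero-length values that data.frame() cannot combine with a length-one Borough. The sketch below reproduces that situation in isolation; emptyTab, idxColon, idxSafe and msg are illustrative names only.

```r
# An empty frequency table, as table() returns when every value is NA
emptyTab <- table(c(NA, NA))
length(emptyTab)  # 0

# 1:length(x) counts down when x is empty, so a for loop would still run
idxColon <- 1:length(emptyTab)

# seq_along(x) gives a zero-length index, so a for loop would be skipped
idxSafe <- seq_along(emptyTab)

# Building a row from a zero-length slice fails: a zero-length BikeId and Pct
# cannot be combined with a length-one Borough
msg <- tryCatch(
  data.frame(BikeId = names(emptyTab[0]),
             Borough = "K&C",
             Pct = as.vector(emptyTab[0]) / 1),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If this is what happens, replacing 1:length(...) with seq_along(...) in both loops would skip empty tables rather than indexing them with 0.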
Any help you guys can provide will be amazing.
Here is some sample data for the same.
Bike.Id  Start.Borough
1        K&C
1        K&C
1        K&C
7        K&C
7        K&C
1        Hammersmith
1        Hammersmith
7        Hammersmith
9        Hammersmith
9        Westminster
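
To make the formula concrete on this sample, here is a minimal sketch that uses base R's table() and prop.table() in place of the nested loops. It is a different technique from the function above, shown only to illustrate the intended result; the column names are assumed to match head(df), and Rental.Id is left out because the calculation does not need it.

```r
# Sample data from above (column names assumed to match head(df))
df <- data.frame(
  Bike.Id = c(1, 1, 1, 7, 7, 1, 1, 7, 9, 9),
  Start.Borough = c("K&C", "K&C", "K&C", "K&C", "K&C",
                    "Hammersmith", "Hammersmith", "Hammersmith",
                    "Hammersmith", "Westminster"),
  stringsAsFactors = FALSE
)

# Count rentals for every (bike, borough) pair
counts <- table(BikeId = df$Bike.Id, Borough = df$Start.Borough)

# Divide each count by its borough (column) total
pct <- prop.table(counts, margin = 2)

# Convert to long format and drop bikes never rented in a borough
summDf <- as.data.frame(pct, responseName = "Pct", stringsAsFactors = FALSE)
summDf <- summDf[summDf$Pct > 0, ]
print(summDf)
```

With this sample, bike 1 accounts for 3 of the 5 K&C rentals, so its Pct there is 0.6, and the Pct values within each borough sum to 1.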