I have a data frame (see below) with 4 pieces per machine and a run time for each piece. I would like to group the run times into 50-hour bins and then calculate the empirical probability of the run times for each piece and machine.
I tried expanding the grid to create the bins, but I think it replicates the rows too many times, which inflates the probabilities.
library(tidyverse)
set.seed(1)
data <- tibble(piece = rep(c("A", "B", "C", "D"), 1000),
               machine = rep(c("Mach1", "Mach2"), times = c(1200, 2800)),
               time = runif(4000, 0, 1000))
I expect the output to look something like this (note that these probabilities will not match the data provided above).
piece machine time prob
A Mach1 50 .03
A Mach1 100 .04
A Mach1 150 .09
A Mach1 200 .12
...
A Mach1 1000 1.0
...
B Mach1 50 .05
B Mach1 100 .11
B Mach1 150 .12
B Mach1 200 .14
...
B Mach1 1000 1.0
...
A Mach2 50 .02
A Mach2 100 .05
...
B Mach2 50 .06
B Mach2 100 .10
...
I would like to use dplyr if possible, so that I can keep my pipe structure.
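For reference, here is a minimal sketch of the kind of pipeline I have in mind. It assumes that the probabilities in the expected output are cumulative (since they reach 1.0 at 1000), that each run time is labelled with the upper edge of its 50-hour bin, and that empty bins should still appear in the output. The names bin_width, bin and result are just placeholders I made up, and tidyr::complete() is used in place of expanding the grid by hand.

```r
library(dplyr)
library(tidyr)

set.seed(1)
data <- tibble(piece = rep(c("A", "B", "C", "D"), 1000),
               machine = rep(c("Mach1", "Mach2"), times = c(1200, 2800)),
               time = runif(4000, 0, 1000))

bin_width <- 50  # assumed bin width in hours (placeholder name)

result <- data %>%
  # label each run time with the upper edge of its bin: 50, 100, ..., 1000
  # (pmax guards against a run time of exactly 0 landing in a "0" bin)
  mutate(bin = pmax(ceiling(time / bin_width) * bin_width, bin_width)) %>%
  # one row per piece/machine/bin with the number of run times in it
  count(piece, machine, bin) %>%
  # add any bins that are empty for an existing piece/machine pair
  complete(nesting(piece, machine),
           bin = seq(bin_width, 1000, by = bin_width),
           fill = list(n = 0L)) %>%
  group_by(piece, machine) %>%
  arrange(bin, .by_group = TRUE) %>%
  # assumption: cumulative share of run times at or below each bin edge
  mutate(prob = cumsum(n) / sum(n)) %>%
  ungroup() %>%
  select(piece, machine, time = bin, prob)
```

Because the counts are computed before the bins are completed, each run time is counted only once, which should avoid the inflation I described. If a per-bin (non-cumulative) probability is wanted instead, n / sum(n) could replace the cumsum() line.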