With the following data:
> dput(head(smart1,16))
structure(list(propertyID = c(233213.22, 233213.22, 233213.22,
233213.22, 233213.22, 233213.22, 233213.22, 233213.22, 233213.22,
233213.22, 233216.55, 233216.55, 233216.55, 233216.55, 233216.55,
233216.55), UptodateTarget = c(1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
0, 0, 1, 0, 1, 1), hourUTC = c(20, 14, 14, 18, 14, 18, 14, 15,
15, 21, 15, 20, 15, 14, 19, 21)), row.names = 26:41, class = "data.frame")
> data.frame(smart1)
   propertyID UptodateTarget hourUTC
26   233213.2              1      20
27   233213.2              0      14
28   233213.2              1      14
29   233213.2              0      18
30   233213.2              0      14
31   233213.2              0      18
32   233213.2              0      14
33   233213.2              0      15
34   233213.2              1      15
35   233213.2              0      21
36   233216.5              0      15
37   233216.5              0      20
38   233216.5              1      15
39   233216.5              0      14
40   233216.5              1      19
41   233216.5              1      21
I am trying to add a fourth column, "probability", that produces this output:
> data.frame(smart1)
   propertyID UptodateTarget hourUTC probability
26   233213.2              1      20           1
27   233213.2              0      14        0.25
28   233213.2              1      14        0.25
29   233213.2              0      18           0
30   233213.2              0      14        0.25
31   233213.2              0      18           0
32   233213.2              0      14        0.25
33   233213.2              0      15         0.5
34   233213.2              1      15         0.5
35   233213.2              0      21           0
36   233216.5              0      15         0.5
37   233216.5              0      20           0
38   233216.5              1      15         0.5
39   233216.5              0      14           0
40   233216.5              1      19           1
41   233216.5              1      21           1
I want the probability column to hold the probability of a propertyID having an UptodateTarget in a given hourUTC. For each combination of propertyID and hourUTC, it should be the sum of UptodateTarget divided by the number of UptodateTarget rows in that same combination.
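Since UptodateTarget only takes the values 0 and 1, the sum divided by the count within a group is just the group mean. A minimal base R sketch of that calculation follows, assuming the data frame is named smart1 as in the dput output and that grouping on the exact propertyID and hourUTC values is what is wanted. The use of ave() here is one possible approach, not the only one.

```r
# Rebuild the sample data from the dput() output shown earlier.
smart1 <- data.frame(
  propertyID     = c(rep(233213.22, 10), rep(233216.55, 6)),
  UptodateTarget = c(1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1),
  hourUTC        = c(20, 14, 14, 18, 14, 18, 14, 15, 15, 21, 15, 20, 15, 14, 19, 21),
  row.names      = 26:41
)

# For a 0/1 column, sum / count within a group equals the group mean.
# ave() applies FUN within each propertyID x hourUTC group and returns
# a vector aligned with the original rows, so it can be assigned directly.
smart1$probability <- ave(
  smart1$UptodateTarget,
  smart1$propertyID, smart1$hourUTC,
  FUN = mean
)

print(smart1)
```

On the 16 sample rows this gives the same values as the desired output above, for example 0.25 for propertyID 233213.22 at hour 14 (one UptodateTarget out of four rows).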