How can I find the probability of occurrence for the pandas df below? I am trying to find the probability of a beer being associated with one store over the others. My current event time window is one day.
I have a dataframe like the one below:
eventtime                  name      src_store
January 14, 2018 4:57:35   budlight  NaN
January 14, 2018 4:51:31   coors     5-119
January 14, 2018 4:31:32   pabst     NaN
January 14, 2018 4:57:31   budlight  5-118
January 14, 2018 4:58:21   coors     5-119
January 14, 2018 4:57:37   NaN       5-120
January 14, 2018 4:18:31   budlight  5-118
January 14, 2018 4:57:31   coors     5-119
January 14, 2018 4:57:52   NaN       5-120
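Some code to give me a comparison matrix (NaN values are filled with 'NONE' first, since crosstab would otherwise drop them):
pd.crosstab(df.name.fillna('NONE'), df.src_store.fillna('NONE'))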
src_store  5-118  5-119  5-120  NONE
name
NONE           0      0      2     0
budlight       2      0      0     1
coors          0      3      0     0
pabst          0      0      0     1
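For reference, here is a minimal sketch that rebuilds the sample data and turns the same counts into row-wise proportions, which is one way to read "the probability of a beer being associated with one store". The eventtime column is left out for brevity, and treating missing values as their own 'NONE' category is an assumption, not something the data requires.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the table above (eventtime omitted for brevity).
df = pd.DataFrame({
    "name": ["budlight", "coors", "pabst", "budlight", "coors",
             np.nan, "budlight", "coors", np.nan],
    "src_store": [np.nan, "5-119", np.nan, "5-118", "5-119",
                  "5-120", "5-118", "5-119", "5-120"],
})

# Assumption: missing values become their own 'NONE' category,
# because crosstab drops NaN labels otherwise.
filled = df.fillna("NONE")

# Raw co-occurrence counts, same layout as the matrix above.
counts = pd.crosstab(filled["name"], filled["src_store"])

# normalize="index" divides each row by its row total, giving the
# empirical proportion of each store within each beer name.
probs = pd.crosstab(filled["name"], filled["src_store"], normalize="index")

print(counts)
print(probs.round(3))
```

These are empirical conditional proportions, not p-values; with only nine rows they are very coarse.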
I am trying to get the p-values from these four counts:
Name with src_store
Name without src_store
src_store with name
src_store without name
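One way to read these four counts is as the cells of a 2x2 contingency table for a single (name, src_store) pair. The sketch below builds that table and runs a one-sided Fisher's exact test from scipy on it. The choice of Fisher's exact test, the one-sided alternative, the 'NONE' placeholder, and the helper name pair_p_value are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.stats import fisher_exact

# Sample rows copied from the question (eventtime omitted for brevity).
df = pd.DataFrame({
    "name": ["budlight", "coors", "pabst", "budlight", "coors",
             np.nan, "budlight", "coors", np.nan],
    "src_store": [np.nan, "5-119", np.nan, "5-118", "5-119",
                  "5-120", "5-118", "5-119", "5-120"],
}).fillna("NONE")  # assumption: missing values are their own category


def pair_p_value(data, name, store):
    """Hypothetical helper: 2x2 table and Fisher p-value for one pair."""
    is_name = data["name"] == name
    is_store = data["src_store"] == store
    table = [
        # this name at this store, this name at other stores
        [int((is_name & is_store).sum()), int((is_name & ~is_store).sum())],
        # other names at this store, other names at other stores
        [int((~is_name & is_store).sum()), int((~is_name & ~is_store).sum())],
    ]
    # alternative="greater" asks whether the pair co-occurs more often
    # than expected if name and store were independent.
    _, p = fisher_exact(table, alternative="greater")
    return table, p


table, p = pair_p_value(df, "coors", "5-119")
print(table)
print(p)
```

A small p-value here means the co-occurrence is unlikely under independence; it is not itself the probability that the beer belongs to the store.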
The overall goal is to find the probability that a beer is correlated with a specific src_store.
Expected output (NOT the actual p-values):
eventtime                  name      src_store  p_value
January 14, 2018 4:57:35   budlight  NaN        0.01
January 14, 2018 4:51:31   coors     5-119      0.02
January 14, 2018 4:31:32   pabst     NaN        0
January 14, 2018 4:57:31   budlight  5-118      0.002
January 14, 2018 4:58:21   coors     5-119      0.004
January 14, 2018 4:57:37   NaN       5-120      0.005
January 14, 2018 4:18:31   budlight  5-118      0.006
January 14, 2018 4:57:31   coors     5-119      0.007
January 14, 2018 4:57:52   NaN       5-120      0.008
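A rough sketch of how a p_value column in this shape could be produced: compute one value per distinct (name, src_store) pair, then merge it back onto every row. It assumes a one-sided Fisher's exact test from scipy as the test, 'NONE' as a placeholder for missing values, and a hypothetical helper named row_p_value; none of these come from the data itself.

```python
import numpy as np
import pandas as pd
from scipy.stats import fisher_exact

# Sample rows copied from the question (eventtime omitted for brevity).
df = pd.DataFrame({
    "name": ["budlight", "coors", "pabst", "budlight", "coors",
             np.nan, "budlight", "coors", np.nan],
    "src_store": [np.nan, "5-119", np.nan, "5-118", "5-119",
                  "5-120", "5-118", "5-119", "5-120"],
}).fillna("NONE")  # assumption: missing values are their own category


def row_p_value(row, data):
    """Hypothetical helper: Fisher p-value for the pair found in `row`."""
    is_name = data["name"] == row["name"]
    is_store = data["src_store"] == row["src_store"]
    table = [
        [int((is_name & is_store).sum()), int((is_name & ~is_store).sum())],
        [int((~is_name & is_store).sum()), int((~is_name & ~is_store).sum())],
    ]
    return fisher_exact(table, alternative="greater")[1]


# Run the test once per distinct pair instead of once per row.
pairs = df[["name", "src_store"]].drop_duplicates()
pairs = pairs.assign(
    p_value=pairs.apply(lambda r: row_p_value(r, df), axis=1)
)

# A left merge keeps the original row order and repeats each pair's p-value.
result = df.merge(pairs, on=["name", "src_store"], how="left")
print(result)
```

Rows that share the same name and src_store get the same p_value, so the output will not have a distinct value on every row the way the placeholder numbers above do.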