Consider empirically estimating the conditional distribution Pr(Y|X), where both X and Y are discrete. Both variables have been mapped to integer sets such that X in {1, ..., N_X} and Y in {1, ..., N_Y}.
I have a data frame of observations, obs, such that obs$x[t] and obs$y[t] are the observed X and Y values for event t.
My question, then, is: what is the most efficient way to convert obs into a matrix F containing the empirical distributions, such that
F[i,j] = sum((obs$x == i) & (obs$y == j))/sum(obs$x == i)
Of course, I can use a double for loop over i in 1:N_X and j in 1:N_Y, but I'm looking for the most efficient way.
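For reference, here is one possible vectorised sketch of the computation described above, using base R's table() and prop.table(). The sample data and the values of N_X and N_Y are made up for illustration only; this is one candidate approach, not necessarily the fastest for every data size.

```r
# Hypothetical example data; N_X and N_Y are the known category counts.
N_X <- 3
N_Y <- 2
obs <- data.frame(x = c(1, 1, 2, 2, 2, 3),
                  y = c(1, 2, 1, 1, 2, 2))

# Factors with explicit levels keep a row/column for every value in
# 1:N_X and 1:N_Y, even those that never occur in the sample.
counts <- table(factor(obs$x, levels = seq_len(N_X)),
                factor(obs$y, levels = seq_len(N_Y)))

# margin = 1 divides each row by its row total, i.e. by sum(obs$x == i).
# The name F follows the question; note it masks R's shorthand for FALSE.
F <- prop.table(counts, margin = 1)
```

As with the formula above, any row i with no observations comes out as NaN (0/0), so such rows may need separate handling.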