Using DistributionFitTest[] on custom distributions in Mathematica 8

Question

I'm trying to compute the Goodness of Fit of a bi-modal Gaussian distribution. To do this, Mathematica seems to require a symbolic distribution function to which to compare. Because such a bi-modal distribution is not a stock distribution, I'm trying to define one. The obvious use of

MixtureDistribution[{fs,(1-fs),{NormalDistribution[\[mu]S,\[sigma]S],NormalDistribution[\[mu]L,\[sigma]L]}]

generates a distribution that can be plotted, but the analysis used by DistributionFitTest[] fails.

This topic has been addressed in previous questions in discussions between @Sasha and @Jagra:

DistributionFitTest[] for custom distributions in Mathematica

Minimizing NExpectation for a custom distribution in Mathematica

but I was unable to find a resolution that enabled the use of

DistributionFitTest[data,dist,"HypothesisTestData"]

when dist is not a built-in distribution type.

Because the distribution I'm modeling is composed of simple pieces, describing the properties of the distribution is not too difficult, and I have attempted to describe as many features as I know in order to create a well defined distribution that Mathematica 8 would recognize as one of its own. My attempt to define every parameter I can think of follows:

modelDist /: 
PDF[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], x_] :=
PDF[MixtureDistribution[{fS, 1 - fS}, {NormalDistribution[\[Mu]S, \[Sigma]S], NormalDistribution[\[Mu]L, \[Sigma]L]}], x];

modelDist /: 
CDF[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], x_] :=
CDF[MixtureDistribution[{fS, 1 - fS}, {NormalDistribution[\[Mu]S, \[Sigma]S], NormalDistribution[\[Mu]L, \[Sigma]L]}], x];

modelDist /: 
DistributionDomain[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_]] := 
Interval[{-Infinity, Infinity}];

modelDist /: 
Random`DistributionVector[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], n_, prec_] := 
RandomVariate[MixtureDistribution[{fS, 1 - fS}, {NormalDistribution[\[Mu]S, \[Sigma]S], NormalDistribution[\[Mu]L, \[Sigma]L]}], n, WorkingPrecision -> prec];

modelDist /: 
DistributionParameterQ[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_]] := 
!TrueQ[Not[Element[{fS, \[Mu]S, \[Sigma]S, \[Mu]L, \[Sigma]L}, Reals] && fS > 0 && fS < 1 && \[Sigma]S > 0 && \[Sigma]L > 0]];

modelDist /: 
DistributionParameterAssumptions[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_]] := 
Element[{fS, \[Mu]S, \[Sigma]S, \[Mu]L, \[Sigma]L}, Reals] && fS > 0 && fS < 1 && \[Sigma]S > 0 && \[Sigma]L > 0;

modelDist /: 
MomentGeneratingFunction[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], t_] := 
fS E^(t \[Mu]S + (t^2 \[Sigma]S^2)/2) + (1 - fS) E^(t \[Mu]L + (t^2 \[Sigma]L^2)/2);

modelDist /: 
CharacteristicFunction[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], t_] := 
fS E^(I t \[Mu]S + (t^2 \[Sigma]S^2)/2) + (1 - fS) E^(I t \[Mu]L + (t^2 \[Sigma]L^2)/2)

modelDist /: 
Moment[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_], n_] := 
Piecewise[{{fS*\[Sigma]S^n*(-1 + n)!!*Hypergeometric1F1[-(n/2), 1/2, -(\[Mu]S^2/(2*\[Sigma]S^2))] + (1 - fS) * \[Sigma]L^n*(-1 + n)!! * Hypergeometric1F1[-(n/2), 1/2, -(\[Mu]L^2/(2*\[Sigma]L^2))], Mod[n, 2] == 0}}, \[Mu]S*\[Sigma]S^(-1 + n)*n!!* Hypergeometric1F1[(1 - n)/2, 3/2, -(\[Mu]S^2/(2*\[Sigma]S^2))] + (1 - fS) * \[Mu]L*\[Sigma]L^(-1 + n)*n!! * Hypergeometric1F1[(1 - n)/2, 3/2, -(\[Mu]L^2/(2*\[Sigma]L^2))]];

modelDist /: 
Mean[modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_]] := 
fS \[Mu]S + (1 - fS) \[Mu]L

modelDist /: 
Expectation[expr_, x_ \[Distributed] modelDist[fS_, \[Mu]S_, \[Sigma]S_, \[Mu]L_, \[Sigma]L_]] := 
fS*Expectation[expr, x \[Distributed] NormalDistribution[\[Mu]S, \[Sigma]S]] + (1 - fS)*Expectation[expr, x \[Distributed] NormalDistribution[\[Mu]L, \[Sigma]L]]

Everything seems to work up through the definition of Expectation, which throws

TagSetDelayed::tagpos: Tag modelDist in Expectation[expr_,x_\[Distributed]modelDist[fS_,\[Mu]S_,\[Sigma]S_,\[Mu]L_,\[Sigma]L_]] is too deep for an assigned rule to be found.

I don't know that having a definition for the expectation will magically make everything work, but it's the next step to to try, as having the Expectation allows computation of the Variance, and for all I know, that is the last tag that I need to define. Is there a syntax that will properly define this Expectation[] and pass the expression straight from my modelDist[] to its constituent NormalDistribution[]s?

(And if this entirely the wrong way to go about this, some advice to that effect would be appreciated.)

Do you have values for fs, mu, sigma, L, and S? When I pick them, I get results using just the `MixtureDistribution` that you specified. When I test `RandomVariate` of the data with increasing size, I'd expect to get p values that approach 1. My values jump around randomly. They do that even if I use `NormalDistribution` instead of yours though. What do you mean when you say the analysis fails? — 0xFE, Dec 29 '12 at 05:11
If I run DistributionFitTest[] on a model set of data with the distribution I specified, it never finishes, it just runs and runs and runs... If I try to fit the same data with an ordinary gaussian (not bi-modal), then it works just fine. Many other distribution-related functions do work, but DistributionFitTest doesn't... since the method is rather opaque, I'm not sure what aspect(s) of my custom distribution are missing... — KDN, Dec 29 '12 at 13:19

Using DistributionFitTest[] on custom distributions in Mathematica 8

0 Answers0