0

After years of lurking around on stack overflow, I finally post my first question as I cannot find a post describing my problem.

For one aspect of a project I plot the distribution of a parameter (orientation) contained in a dataframe (df) to find out that it adopts a bimodal distribution. I am showing here the dataframe for the "2.7" sample:

 A tibble: 5,280 x 13
   ID    number_of_points length  bend average_curvatu~ relative_z_chan~ average_z_height orientation Depth Scale AspectR
   <chr>            <dbl>  <dbl> <dbl>            <dbl>            <dbl>            <dbl>       <dbl> <dbl> <dbl>   <dbl>
 1 14-3~              935   940.  1.33             177.            0.291             171.       154.   0.35     6     2.7
 2 14-3~              629   629.  1.07             235.            0.346             467.        29.2  0.35     6     2.7
 3 14-3~              550   562.  1.18             159.            0.402             286.        22.5  0.35     6     2.7
 4 14-3~              334   334.  1.03             322.            0.507             444.        37    0.35     6     2.7
 5 14-3~              397   397.  1.01             292.            0.484             415.        16.4  0.35     6     2.7
 6 14-3~             1132  1135.  1.06             246.            0.301             401.        31.1  0.35     6     2.7
 7 14-3~             1169  1175.  1.14             179.            0.255             370.        11.9  0.35     6     2.7
 8 14-3~             1363  1366.  1.04             273.            0.183             383.        23.1  0.35     6     2.7
 9 14-3~              841   843.  1.09             274.            0.310             307.        21.5  0.35     6     2.7
10 14-3~              881   883.  1.16             210.            0.226             451.       164.   0.35     6     2.7
# ... with 5,270 more rows, and 2 more variables: Circularity <dbl>, lam <chr>

Using normalmixEM I am able to determine the parameters of the 2 gaussians that model the bimodal distribution.

my_mix <- mixtools::normalmixEM(df$orientation, lambda = NULL, mu = NULL, sigma = NULL, maxit = 5000)

I'm able to extract mu, sigma and lambda parameters of the curves and store them into a table (FIT). I am showing below the parameters of both modes for the sample "2.7":

A tibble: 10 x 5
   ID    lambda    mu sigma AspectR
   <chr>  <dbl> <dbl> <dbl>   <dbl>
 1 2.7    0.723  38.5  22.8     2.7
 2 2.7    0.277 150.   20.2     2.7 

I am able to generate a plot with a histogram and the 2 gaussians overlaid by running the following code:

p <- ggplot(df$orientation, aes(x = orientation)) +
         geom_histogram(binwidth = 5) +

 mapply(
      function(mean, sd, lambda, n, binwidth){
        stat_function(
          fun = function(x){
            (dnorm(x, mean = mean, sd = sd)) * n * binwidth * lambda
          }
        )
      },
      mean = FIT$mu,
      sd = FIT$sigma,
      lambda = FIT$lambda,
      n = length(df$orientation),
      binwidth = 5
    )
})

plot generated by above code

Now what I need to do is to estimate the counts in each of the modes. My plan to do this is to compute the full width at half maximum (FWHM) of each of the modes, get the range and find out the counts in the ranges.

I tried applying what I saw here (Finding the full width half maximum of a peak) but this seems to not be for R and also in here (Determining the FWHM from a distribution in R), but in the latter it's for a unimodal distribution.

My feeling is that I may be able to apply some function in the mapply function written above that can compute the FWHM the same way it is done in Determining the FWHM from a distribution in R but all my attempts have failed.

Any suggestions?

Thank you and I hope this post is clear. Apologies if not, I'm an overflow gumby.

Will

2 Answers2

0

You may be overthinking this. lambda and (1-lambda) give estimates of the proportions of your data belonging to each mode.

Robert Davy
  • 866
  • 5
  • 13
0

I see yes, especially if I'm interested in reporting a ratio. I will do that thanks.

However I would still be curious as to how to do what I originally intended to for potential future purposes :--).