I am trying to solve the 'Mendel's First Law' problem on http://rosalind.info/

I have tried several different approaches, but I just can't get my solution to return the same answer as the sample problem on their page. I know their sample output is correct though.

Here is what I have:

traitProb :: Int -> Int -> Int -> Double
traitProb k m n = getProb list
      where list = cartProd genotypes genotypes
            genotypes = (replicate k Dominant) ++ (replicate m Heterozygous) ++ (replicate n Recessive)
            getProb = sum . map ((flip (/)) total . getMultiplier)
            total = fromIntegral $ length list
            getMultiplier (Dominant, Dominant) = 1.0
            getMultiplier (Recessive, Dominant) = 1.0
            getMultiplier (Dominant, Recessive) = 1.0
            getMultiplier (Dominant, Heterozygous) = 1.0
            getMultiplier (Heterozygous, Dominant) = 1.0
            getMultiplier (Heterozygous, Heterozygous) = 0.75
            getMultiplier (Heterozygous, Recessive) = 0.5
            getMultiplier (Recessive, Heterozygous) = 0.5
            getMultiplier (Recessive, Recessive) = 0.0

I am not sure whether the code is wrong, or my method of computing the probability is wrong. Essentially the idea is to get a list of all possible parents, and then based on whether they are Homozygous Dominant, Recessive or Heterozygous, compute the probability of each pair of parents producing a child with at least one dominant allele. Then divide each result by the total number of pairs of parents. After that I just sum the list. But my answer is wrong by a little bit.

Can anyone point me in the right direction?

EDIT: cartProd is the 'cartesian product' of the two lists passed to it, if you will.

cartProd :: [a] -> [a] -> [(a, a)]
cartProd xs ys = [ (x, y) | x <- xs, y <- ys ]
user3468950
  • I think you should split your function up into different separate tasks. That function is doing too much and is not exactly readable, IMHO. `getMultiplier` can also be reduced by matching `0.75`, `0.5` and `0.0` and let everything else be `1.0`: `getMultiplier (_, _) = 1.0`. Oddly I've done the same exercise today. You can find my solution [here](https://github.com/Jefffrey/Solidran/tree/master/Solidran/IPRB), if you need inspiration. :) – Shoe Jun 01 '14 at 00:25
  • We also need to know what `cartProd` does. I guess it's the cartesian product, though. If you could post an [SSCCE](http://sscce.org/) that would increase your chance of getting an answer. – Shoe Jun 01 '14 at 00:27
  • Thanks for the reply. I knew when I was writing getMultiplier that I wasn't doing it efficiently but I realize now just how redundant most of the pattern matching is. I will try to change my solution, and if I need to I will peek at yours. – user3468950 Jun 01 '14 at 00:32
  • If you are able to provide an SSCCE (just by providing a compilable version, on [Ideone](http://ideone.com/) or anything else) I'll gladly help you. – Shoe Jun 01 '14 at 00:38
  • How much is "a little bit"? – Code-Apprentice Jun 01 '14 at 00:49
  • Jefffrey: http://ideone.com/GOeDT3, Code-Guru: Enough that I know it's wrong. – user3468950 Jun 01 '14 at 00:52
  • You can probably simplify your approach by calculating the probability of selecting each genotype for each parent rather than explicitly listing all pairs of parents. – Code-Apprentice Jun 01 '14 at 00:53
  • I tried and was sure that I was doing it right, but eventually resorted to doing it like this because I couldn't even get the right answer for the sample input of 2 2 2. – user3468950 Jun 01 '14 at 00:56

1 Answer

I suggest making a slight change in your thinking by doing the calculation in three steps:

  1. What is the probability of getting genotype X for the first parent? (Also, how many different choices are there for X?)

  2. What is the probability of getting genotype Y for the second parent? (The first parent has already been picked, so there is one fewer organism to choose from.)

  3. Given the genotypes X and Y of the parents, what is the probability of a child displaying the dominant phenotype (that is, having at least one dominant allele)?

Multiply the probabilities from steps 1-3 for each (X, Y) pair, then sum those products over all pairs.
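The calculation above can be sketched in Haskell roughly as follows. This is only an illustration, not the asker's code: the `Genotype` type, `childDominant`, `count` and `dominantProb` are names made up for this sketch, and it assumes the two parents are drawn without replacement from a population of at least two organisms.

```haskell
-- Hypothetical type for this sketch; the question does not show its own definition.
data Genotype = Dominant | Heterozygous | Recessive
  deriving (Eq, Show, Enum, Bounded)

-- Step 3: probability that a child of parents x and y has a dominant allele.
childDominant :: Genotype -> Genotype -> Double
childDominant Recessive    Recessive    = 0.0
childDominant Heterozygous Recessive    = 0.5
childDominant Recessive    Heterozygous = 0.5
childDominant Heterozygous Heterozygous = 0.75
childDominant _            _            = 1.0

-- Number of organisms of a given genotype (k dominant, m heterozygous, n recessive).
count :: Int -> Int -> Int -> Genotype -> Double
count k _ _ Dominant     = fromIntegral k
count _ m _ Heterozygous = fromIntegral m
count _ _ n Recessive    = fromIntegral n

-- Multiply steps 1-3 for every (x, y) pair of genotypes and sum the products.
-- Assumes k + m + n >= 2, otherwise the second draw is impossible.
dominantProb :: Int -> Int -> Int -> Double
dominantProb k m n = sum
  [ pFirst * pSecond * childDominant x y
  | x <- [minBound .. maxBound]
  , y <- [minBound .. maxBound]
  , let pFirst = count' x / total
    -- Step 2 is conditional: one organism of genotype x is already gone.
  , let remaining = if x == y then count' y - 1 else count' y
  , let pSecond = remaining / (total - 1)
  ]
  where
    count' = count k m n
    total  = fromIntegral (k + m + n)
```

For example, `dominantProb 2 2 2` can be evaluated in GHCi to compare against the sample output on the problem page.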

When I drew the tree diagram by hand, I found it easier to calculate the probability of a child NOT having the dominant allele. There are fewer choices to sum and then you can subtract this sum from 1.
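As a rough sketch of that complement idea (the names `recessiveProb` and `dominantViaComplement` are invented here, and the code assumes two parents drawn without replacement from at least two organisms), only three kinds of ordered parent pairs can produce a fully recessive child:

```haskell
-- Probability that the child has NO dominant allele.
-- k dominant, m heterozygous, n recessive organisms; assumes k + m + n >= 2.
recessiveProb :: Int -> Int -> Int -> Double
recessiveProb k m n = favourable / (t * (t - 1))
  where
    m' = fromIntegral m :: Double
    n' = fromIntegral n :: Double
    t  = fromIntegral (k + m + n) :: Double
    favourable =
        n' * (n' - 1)            -- Recessive x Recessive: child is always recessive
      + 2 * n' * m' * 0.5        -- Recessive x Heterozygous, both orders, half the time
      + m' * (m' - 1) * 0.25     -- Heterozygous x Heterozygous, a quarter of the time

-- Subtract from 1 to get the probability of at least one dominant allele.
dominantViaComplement :: Int -> Int -> Int -> Double
dominantViaComplement k m n = 1 - recessiveProb k m n
```

The denominator `t * (t - 1)` counts ordered pairs of distinct organisms, which is why no organism is ever paired with itself.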

Code-Apprentice
  • Here is the solution I came up with using this approach: `traitProb' k m n = let j = k + n + m; a = k + n + m - 1 in ((k/j) * (((k-1)/a) + (n/a) + (m/a))) + ((n/j) * ((k/a) + (m/a/2))) + ((m/j) * ((k/a) + (n/a/2) + ((m-1)/a * 0.75 )))` It is basically a write-only solution, but it works. I tried this originally but couldn't get the right answer for some reason. – user3468950 Jun 01 '14 at 01:15