I have the following problem:
- I have two Strings holding DNA sequences (consisting of ACGT), which differ in one or two positions.
- Finding the differences is trivial, so let's just ignore that.
- For each difference, I want to get the consensus symbol (e.g. M for A or C) that represents both possibilities.
I know I could just write a huge if-cascade, but I suspect that would be not only ugly and hard to maintain, but also slow.
What is a fast, easy-to-maintain way to implement that? Some kind of lookup table perhaps, or a matrix for the combinations? Any code samples would be greatly appreciated. I would have used Biojava, but the version I am currently using does not offer that functionality (or I haven't found it yet).
Update: there seems to be a bit of confusion here. The consensus symbol is a single char that stands for the char at the same position in both sequences.
String1 and String2 are, for example, "ACGT" and "ACCT"; they mismatch at position 2 (counting from 0). So I want the consensus string to be "ACST", because S stands for "either C or G".
I want to make a method like this:
char getConsensus(char a, char b)
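To make the "matrix for the combinations" idea concrete, here is a minimal sketch that only covers plain ACGT input. The class name `PairConsensus` and the table layout are my own assumptions, not anything from Biojava; the symbols are the standard IUPAC two-base codes.

```java
// Hypothetical helper; handles only the four unambiguous bases.
final class PairConsensus {
    // Row and column order of the lookup table.
    private static final String BASES = "ACGT";

    // TABLE[i][j] is the IUPAC code covering BASES[i] and BASES[j].
    // The table is symmetric and the diagonal is the base itself.
    private static final char[][] TABLE = {
        //          A    C    G    T
        /* A */ { 'A', 'M', 'R', 'W' },
        /* C */ { 'M', 'C', 'S', 'Y' },
        /* G */ { 'R', 'S', 'G', 'K' },
        /* T */ { 'W', 'Y', 'K', 'T' },
    };

    static char getConsensus(char a, char b) {
        int i = BASES.indexOf(a);
        int j = BASES.indexOf(b);
        if (i < 0 || j < 0) {
            throw new IllegalArgumentException("Not a plain base: " + a + ", " + b);
        }
        return TABLE[i][j];
    }
}
```

With this sketch, `PairConsensus.getConsensus('C', 'G')` gives `'S'`. It rejects anything outside upper-case ACGT, so it cannot be applied again to its own output.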
Update 2: some of the proposed methods work if I only have 2 sequences. I might need to do several iterations of these "consensifications", so the input alphabet could grow from "ACGT" to "ACGTRYKMSWBDHVN", which would make some of the proposed approaches quite unwieldy to write and maintain.
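One way the extended alphabet might stay manageable is to treat each symbol as a set of bases encoded in four bits (A=1, C=2, G=4, T=8), so that the consensus is just a bitwise OR. The following is a sketch under that assumption; the class name `IupacConsensus`, the helper `consensus(String, String)`, and the choice to reject lower-case input and gaps are all mine.

```java
// Hypothetical helper; each IUPAC code is a 4-bit set of bases.
final class IupacConsensus {
    // The index of each symbol is its bit mask: A=1, C=2, G=4, T=8.
    // Index 0 (empty set) is a placeholder and is never a valid input.
    private static final String CODES = "-ACMGRSVTWYHKDBN";

    static char getConsensus(char a, char b) {
        int maskA = CODES.indexOf(a);
        int maskB = CODES.indexOf(b);
        if (maskA <= 0 || maskB <= 0) {
            throw new IllegalArgumentException("Not an IUPAC code: " + a + ", " + b);
        }
        // The union of the two base sets is the OR of the masks.
        return CODES.charAt(maskA | maskB);
    }

    // Position-wise consensus of two sequences of equal length.
    static String consensus(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException("Sequences differ in length");
        }
        StringBuilder sb = new StringBuilder(s1.length());
        for (int i = 0; i < s1.length(); i++) {
            sb.append(getConsensus(s1.charAt(i), s2.charAt(i)));
        }
        return sb.toString();
    }
}
```

Here `IupacConsensus.consensus("ACGT", "ACCT")` yields `"ACST"`, and feeding a result back in works as well: `getConsensus('S', 'A')` yields `'V'` (A, C or G). The whole mapping lives in one 16-character string, which seems easier to maintain than a 15x15 table.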