I'm trying to find a way to have Python count occurrences of a substring within a string in a way that differs from the usual str.count("X").
Here is my general example. My variable is dna="AAAAA"
My goal is to recognize every occurrence of "AA" that exists in the string.
When I run dna.count("AA") and print the result, I get the predictable value of 2, because str.count only counts non-overlapping matches.
However, the result I am looking for is 4: "AA" starts at positions 0, 1, 2, and 3 of "AAAAA". Here is an image that shows visually what I mean. (I do not have the 10 reputation required to embed an image, so I must post a link.) https://docs.google.com/drawings/d/16IGo3hIstcNEqVid8BI6uj09KX4MWWAzSuQcu8AjSu0/edit?usp=sharing
I have been unable to find a satisfactory solution to this problem elsewhere, probably because I'm not sure what to call it. EDIT: I was informed that this is called counting overlapping substrings.
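To make the target behaviour concrete, here is a minimal sketch of one way overlapping occurrences could be counted. The function name count_overlapping is just an illustrative name, not something built into Python; the idea is to repeat str.find, restarting one character after each match instead of skipping past the whole match.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing matches to overlap."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Restart one character after the last match (not len(pattern)
        # characters after it), so overlapping matches are also found.
        start = text.find(pattern, start + 1)
    return count


dna = "AAAAA"
print(dna.count("AA"))               # 2 (non-overlapping)
print(count_overlapping(dna, "AA"))  # 4 (overlapping)
```

Note that the comparison is case-sensitive, so a mixed-case string would need to be normalised first (for example with .upper()).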
The matter becomes more complicated because, in the full program, the string will not be a single repeated letter. It will consist of the four letters A, T, C, and G in random order, and its length is not known in advance. Here is an example: dna="AtcGTgaTgctagcg"
I would need the script to output how many times each pair (AT, TC, CG, TG, etc.) occurs, reading the string one pair at a time and moving one letter to the right at each step.
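As a rough sketch of the kind of output I mean, every two-letter window could be tallied with collections.Counter. The helper name count_pairs and its size parameter are hypothetical, and upper-casing the input is an assumption on my part that "At" and "AT" should count as the same pair.

```python
from collections import Counter


def count_pairs(sequence, size=2):
    """Tally every window of `size` letters, sliding one letter at a time."""
    # Assumption: case does not matter, so normalise to upper case first.
    seq = sequence.upper()
    # Windows start at indexes 0 .. len(seq) - size, so neighbours overlap.
    return Counter(seq[i:i + size] for i in range(len(seq) - size + 1))


dna = "AtcGTgaTgctagcg"
pairs = count_pairs(dna)
for pair, n in sorted(pairs.items()):
    print(pair, n)
```

For the example string this gives 14 windows in total, with AT, CG, GC, and TG each appearing twice and the other pairs once.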
Thank you.