To efficiently get the frequencies of the letters of a given alphabet (here ABC)
in a string code, as a dictionary,
I can write a function like this (Python 3):
def freq(code):
    # "/" is already true division in Python 3, so float() is not needed
    return {n: code.count(n) / len(code) for n in 'ABC'}
Then
code='ABBBC'
freq(code)
Gives me
{'A': 0.2, 'C': 0.2, 'B': 0.6}
But how can I get the frequencies for each position across a list of strings of unequal lengths? For instance, mcode=['AAB', 'AA', 'ABC', '']
should give me a nested structure such as a list of dicts (where each dict holds the frequencies at one position):
[{'A': 1.0, 'C': 0.0, 'B': 0.0},
{'A': 0.66, 'C': 0.0, 'B': 0.33},
{'A': 0.0, 'C': 0.5, 'B': 0.5}]
I cannot figure out how to compute the frequencies per position across all strings and wrap this in a list comprehension. Inspired by other SO questions on word counts, e.g. the well-discussed post "Python: count frequency of words in a list", I thought the Counter class from the collections module
might help.
To picture it, write the mcode strings on separate lines:
AAB
AA
ABC
Then what I need is the column-wise frequencies (columns AAA, AAB, BC) of the alphabet ABC as a list of dicts, where each list element holds the frequencies of A, B and C in one column.
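To make the column-wise idea concrete, here is a rough sketch of one possible approach, not necessarily the best one. It assumes that itertools.zip_longest is acceptable for walking the columns and that shorter strings should simply be left out of the columns they do not reach. The name positional_freq is made up for illustration.

```python
from collections import Counter
from itertools import zip_longest


def positional_freq(strings, alphabet='ABC'):
    """Return one {letter: frequency} dict per column (hypothetical helper)."""
    result = []
    # zip_longest pads the shorter strings with None, so every column
    # up to the length of the longest string is visited.
    for column in zip_longest(*strings):
        # Drop the padding: strings that are too short do not count here.
        letters = [ch for ch in column if ch is not None]
        counts = Counter(letters)
        total = len(letters)  # at least 1, since the longest string reaches it
        result.append({a: counts[a] / total for a in alphabet})
    return result


mcode = ['AAB', 'AA', 'ABC', '']
for position in positional_freq(mcode):
    print(position)
```

With this sketch the empty string contributes nothing, so the first column is computed over three letters rather than four. Letters outside the alphabet would still count towards the column total, which may or may not be what is wanted.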