Count the freq of each value for each key in a dictionary, and put the count info into dataframe

Question

Currently I have a dictionary like this: {'axin': ['GAUAUG', 'AAAUGC', 'AAAUGC', 'GUCAUG', 'UUUGAU', 'UUUGAU'], 'hedgehog-protein': ['ACUGGC', 'GUCAUG', 'UUUGAU'], 'pangolin': ['GAACAC', 'GAAAGA', 'UUUGAU', 'AAUACU']}.

I want to create a dataframe with each key as a column, and each value as the row name (or vice-versa, doesn't really matter), whilst also counting the occurrences of each value per key. For example the first few lines of output would be this:

        axin hedgehog-protein pangolin 
GAUAUG  1    0                0        
AAAUGC  2    0                0        
GUCAUG  1    1                1
UUUGAU  2    1                1        
ACUGGC  0    1                0

The most similar question I found was Create a pandas dataframe of counts however they had different types of values for each key, and so weren't common across keys. Also, as the length of my value lists are different for each key, ValueError is a problem I keep running into.

score 4 · Accepted Answer · answered Mar 11 '19 at 12:22

4

Use Counter in dictionary comprehension, then replace missing values and convert to integers:

d = {'axin': ['GAUAUG', 'AAAUGC', 'AAAUGC', 'GUCAUG', 'UUUGAU', 'UUUGAU'], 
     'hedgehog-protein': ['ACUGGC', 'GUCAUG', 'UUUGAU'],  
     'pangolin': ['GAACAC', 'GAAAGA', 'UUUGAU', 'AAUACU']}

from collections import Counter

df = pd.DataFrame({k: Counter(v) for k, v in d.items()}).fillna(0).astype(int)
print (df)
        axin  hedgehog-protein  pangolin
AAAUGC     2                 0         0
AAUACU     0                 0         1
ACUGGC     0                 1         0
GAAAGA     0                 0         1
GAACAC     0                 0         1
GAUAUG     1                 0         0
GUCAUG     1                 1         0
UUUGAU     2                 1         1

answered Mar 11 '19 at 12:22

jezrael

822,522
95
1,334
1,252

Thank you, very simple solution! – sian Mar 11 '19 at 12:32
@jezrael how to covert key's as rows and values as columns? – dondapati Mar 11 '19 at 13:57
1

@dondapati - use `df = pd.DataFrame.from_dict({k: Counter(v) for k, v in d.items()}, orient='index').fillna(0).astype(int)` – jezrael Mar 11 '19 at 13:58

Count the freq of each value for each key in a dictionary, and put the count info into dataframe

1 Answers1