OK, I have read all of these before, and I think pandas could be a solution, but my problem is slightly different:
Print a dictionary of lists vertically
Printing Lists as Tabular Data
print dictionary values which are inside a list in python
Print a dictionary into a table

I have a dict of lists:

d = {"A": ["i1", "i2", "i3"], "B": ["i1", "i4", "i5"], "C": ["i1", "i2", "i5"]}

What I want as output is:

    i1    i2    i3    i4    i5   
A    x     x     x     -     -   
B    x     -     -     x     x   
C    x     x     -     -     x  

(or even better,

    i1    i2    i3    i4    i5  
A    A     A     A     -     -  
B    B     -     -     B     B  
C    C     C     -     -     C  

or a value matching A, B, C or (A,in) in another dictionary, but if I can merely have the first table, I'll be more than happy)

No list contains duplicates, and every element in these lists is drawn from the same master list (my actual problem is building a grid of annotated terms against the corresponding proteins; the keys are the annotated terms, which are functions related to these proteins in my context of study).

I can indeed think of a convoluted way to do this (build vectors of 0 and 1 by comparing each list to the general list, associate these vectors with the keys, put them in a pandas DataFrame, which will be well formatted once I have restored the right number of entries per list, and print it), but this seems tedious and unpythonic.

I think there must be a known way to do this with some module (pandas, prettytable, other?) that I just don't know about. I'd be glad for any insight. Thanks.

Ando Jurai

3 Answers


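Use apply with a lambda:

import pandas as pd

d = {
    "A": ['i1', 'i2', 'i3'],
    "B": ['i1', 'i4', 'i5'],
    "C": ['i1', 'i2', 'i5']
}

# One column per key; this constructor needs all lists to have the same length
df = pd.DataFrame(d)

# For each column, build a Series indexed by its items and filled with the column name
df.apply(lambda c: pd.Series(c.name, c.values)).fillna('-').T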

  i1 i2 i3 i4 i5
A  A  A  A  -  -
B  B  -  -  B  B
C  C  C  -  -  C
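
To make the mechanics explicit, here is a minimal sketch of the same idea written without apply: one Series per key, indexed by that key's items, which pandas then aligns on the union of all items. The function name membership_table and the fill parameter are only illustrative, not part of the answer above. Because it never builds the intermediate DataFrame from the raw lists, this variant should also tolerate lists of different lengths.

```python
import pandas as pd

d = {
    "A": ["i1", "i2", "i3"],
    "B": ["i1", "i4", "i5"],
    "C": ["i1", "i2", "i5"],
}

def membership_table(mapping, fill="-"):
    """Hypothetical helper: rows are keys, columns are items, cells hold the key or `fill`."""
    # One Series per key: the index is the key's items, every value is the key itself.
    per_key = {key: pd.Series(key, index=items) for key, items in mapping.items()}
    # Building a DataFrame from a dict of Series aligns them on the union of their indexes;
    # items missing from a given key become NaN.
    wide = pd.DataFrame(per_key)
    # Replace the gaps and flip so that keys become rows.
    return wide.fillna(fill).T

table = membership_table(d)
print(table)
```

To get the item itself in each cell instead of the key, the Series can be built as pd.Series(items, index=items).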
piRSquared
  • Awesome. Thanks. I think I am always thrown by the way pandas converts lists in dicts to columns. If I understand correctly, I just have to use c.values instead of c.name to get the value instead? That's great. – Ando Jurai Feb 22 '17 at 11:14

Just a simple draft (based on a lot of str.format):

def create_table(dictionary, columns):
    # Fill in the symbols depending on the presence of the corresponding columns.
    # Iterate over `columns` (not a set) so the cells stay in header order.
    filled_dct = {}
    for key, lst in dictionary.items():
        members = set(lst)  # only to speed up "in" calls, could be omitted
        filled_dct[key] = [' X' if col in members else ' -' for col in columns]

    # A template string that is filled for each row
    row_template = '   '.join(['{}'] * (len(columns) + 1))

    print(row_template.format(*([' '] + columns)))
    for rowname, rowcontent in sorted(filled_dct.items()):
        print(row_template.format(*([rowname] + rowcontent)))

dct = {"A": ['i1', 'i2', 'i3'],
       "B": ['i1', 'i4', 'i5'],
       "C": ['i1', 'i2', 'i5']}

columns = ['i1', 'i2', 'i3', 'i4', 'i5']

create_table(dct, columns)
    i1   i2   i3   i4   i5
A    X    X    X    -    -
B    X    -    -    X    X
C    X    X    -    -    X

It's not very flexible (no variable column width, etc.), but it should be easy to extend.
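
As one possible extension, here is a rough sketch that sizes each column to its widest label and returns the table as a string instead of printing it. The name format_table and the present, absent and gap parameters are assumptions made for illustration; they are not part of the draft above.

```python
def format_table(dct, columns, present="X", absent="-", gap=3):
    """Hypothetical helper: return a presence/absence grid with widths fitted to the labels."""
    # Width of the first column: the widest row label (0 if there are no rows).
    row_width = max((len(str(key)) for key in dct), default=0)
    # Each data column must fit its header and both markers.
    col_widths = [max(len(str(col)), len(present), len(absent)) for col in columns]
    sep = " " * gap

    def make_row(label, cells):
        # Left-align the row label, right-align every cell to its column width.
        padded = [str(cell).rjust(width) for cell, width in zip(cells, col_widths)]
        return sep.join([str(label).ljust(row_width)] + padded)

    lines = [make_row("", columns)]
    for key in sorted(dct):
        members = set(dct[key])  # a set keeps each membership test cheap
        cells = [present if col in members else absent for col in columns]
        lines.append(make_row(key, cells))
    return "\n".join(lines)


dct = {"A": ["i1", "i2", "i3"],
       "B": ["i1", "i4", "i5"],
       "C": ["i1", "i2", "i5"]}
columns = ["i1", "i2", "i3", "i4", "i5"]

print(format_table(dct, columns))
```

Returning a string rather than printing keeps the formatting testable and lets the caller decide where the table goes (console, file, log).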

MSeifert
  • Thanks for the answer. This is what I meant by "convoluted ways" (admittedly, it is not as convoluted as I would have thought, but it is not really the Python way to do what I wanted). I'll upvote you for the effort. – Ando Jurai Feb 22 '17 at 11:13

Consider your input dictionary:

dic = {"A":["i1", "i2", "i3"], "B":["i1", "i4", "i5"], "C":["i1", "i2", "i5"]}

Use dict.fromkeys() with each list in dic (an element of dic.values()) as the iterable and the corresponding key of dic as the default value, so that every item in the list maps to its key.

A dictionary comprehension collects these per-key dictionaries, which then become the columns of the DataFrame. Transpose it so that the column headers become the index and vice versa.

Finally, fill the NaNs with "-".

import pandas as pd

pd.DataFrame({k:dict.fromkeys(v,k) for k,v in dic.items()}).T.fillna("-")
#                               ^----- replace k with "x" to get the first output back
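
For readers who find the one-liner dense, the same steps can be sketched one at a time. The labels dictionary below is a made-up example of the "value matching A, B, C in another dictionary" variant mentioned in the question, and the final sort_index call is an extra step added here only to make the column order explicit.

```python
import pandas as pd

dic = {"A": ["i1", "i2", "i3"], "B": ["i1", "i4", "i5"], "C": ["i1", "i2", "i5"]}

# Step 1: one inner dict per key, mapping every item of that key to a marker.
nested = {k: dict.fromkeys(v, "x") for k, v in dic.items()}
# nested["A"] is now {"i1": "x", "i2": "x", "i3": "x"}

# Step 2: outer keys become columns, inner keys become the row index.
# Step 3: transpose so keys are rows, fill the gaps, and sort the item columns.
marks = pd.DataFrame(nested).T.fillna("-").sort_index(axis=1)
print(marks)

# Variant: take the cell value from another dictionary (hypothetical labels).
labels = {"A": "alpha", "B": "beta", "C": "gamma"}
nested_labels = {k: dict.fromkeys(v, labels.get(k, k)) for k, v in dic.items()}
labelled = pd.DataFrame(nested_labels).T.fillna("-").sort_index(axis=1)
print(labelled)
```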

Nickil Maveli
  • Nice answer too, very pythonic and clever. I won't select it as the accepted answer because creating a dict of dicts is slightly harder to handle conceptually. But I'll remember the masterful use of it to tell pandas what to do with columns and rows... – Ando Jurai Feb 22 '17 at 11:21