Associating list value in python dictionary with relevant key

Question

I have a two-column, tab-separated input file that I would like to use to populate a dictionary in Python. The first column holds the keys (there are duplicates) and the second column holds the values.

Sample input:

cat tail
cat whisker
cat meow
cat black
dog tail
dog paw
dog bark
bird    beak

I have written the following code. Its output is wrong, but it does contain the dictionary format I am looking for, which associates each key from column 1 with all of its values from column 2.

The code that I have been using is:

#!/usr/bin/python
# -*- coding: utf-8 -*-

keys = []
values = []

with open('animal-trial', "rU") as f:
    for line in f:
        line = line.split()
        keys.append(line[0])
        values.append(line[1])
    d = {}
    for k,v in zip(keys, values):
        d.setdefault(k, []).append(v)
    print d

I have looked at other references [HERE], [HERE] and [HERE]; however, all of the suggestions, including those using defaultdict, give me the same output rather than the desired one.

The actual output is:

{'cat': ['tail']}
{'cat': ['tail', 'whisker']}
{'cat': ['tail', 'whisker', 'meow']}
{'cat': ['tail', 'whisker', 'meow', 'black']}
{'dog': ['tail'], 'cat': ['tail', 'whisker', 'meow', 'black']}
{'dog': ['tail', 'paw'], 'cat': ['tail', 'whisker', 'meow', 'black']}
{'dog': ['tail', 'paw', 'bark'], 'cat': ['tail', 'whisker', 'meow', 'black']}
{'bird': ['beak'], 'dog': ['tail', 'paw', 'bark'], 'cat': ['tail', 'whisker', 'meow', 'black']}

The desired output is

{'bird': ['beak'], 'dog': ['tail', 'paw', 'bark'], 'cat': ['tail', 'whisker', 'meow', 'black']}

Can anyone point me to where I am making an error or have a more comprehensive solution so that the final result is one dictionary?

You *do* have one dictionary. Your indentation must be different to what you have posted; your final print statement is probably inside the for loop. — Daniel Roseman, Jan 27 '17 at 15:12
@owwoow14 Thats right your code works I guess. Indent back the statement `print d` — Mohammad Yusuf, Jan 27 '17 at 15:32
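
To illustrate the point made in the comments, here is a minimal sketch in which the dictionary is built in one pass and printed only after the loop has finished. The in-memory `sample` is a hypothetical stand-in for the 'animal-trial' file, and the `snapshots` counter is only there to show how many times a `print` placed inside the loop would have fired.

```python
import io

# Hypothetical stand-in for the 'animal-trial' file from the question.
sample = io.StringIO(
    "cat\ttail\ncat\twhisker\ncat\tmeow\ncat\tblack\n"
    "dog\ttail\ndog\tpaw\ndog\tbark\nbird\tbeak\n"
)

d = {}
snapshots = 0
for line in sample:
    fields = line.split()
    if len(fields) < 2:
        continue  # skip blank or one-column lines
    d.setdefault(fields[0], []).append(fields[1])
    snapshots += 1  # a print at this indentation would run once per input line

# Printing at the outer indentation level shows the finished dictionary once.
print(d)
```

With the sample input, a `print` inside the loop would produce eight partial dictionaries, which matches the "actual output" in the question.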

Mohammad Yusuf · Accepted Answer · 2017-01-27T15:23:47.477

2

You can check whether the key is already present: if it is, append to its list; if it is not, create a list with a single element:

d = {}
with open('a12', 'r') as f:
    for line in f:
        if line.strip():
            a = line.split()
            if a[0] not in d:
                d[a[0]] = [a[1]]
            else:
                d[a[0]].append(a[1])
print(d)

Output:

{'cat': ['tail', 'whisker', 'meow', 'black'], 'bird': ['beak'], 'dog': ['tail', 'paw', 'bark']}

With pandas:

import pandas as pd

# Raw string so that \s is passed to the regex engine unchanged
df = pd.read_csv('file_name', header=None, sep=r'\s+')
print(df.groupby(0)[1].apply(list).to_dict())

Output:

{'dog': ['tail', 'paw', 'bark'], 'bird': ['beak'], 'cat': ['tail', 'whisker', 'meow', 'black']}

edited Jan 27 '17 at 15:23

answered Jan 27 '17 at 15:14


why not using a regular dict? – Jan 27 '17 at 15:15
@SembeiNorimaki OK? – Mohammad Yusuf Jan 27 '17 at 15:18

score 1 · Answer 2 · answered Jan 27 '17 at 15:17

I assume you have an input file called f_input.txt.

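You can also use groupby from the itertools module, as in this example. Note that groupby only merges consecutive rows that share a key, so this relies on the input already being grouped by key, as it is in the sample: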

from itertools import groupby

with open("f_input.txt", "r") as f:
    data = [line.split() for line in f if line.strip()]

final = {}
# groupby yields one group per run of consecutive rows sharing a key
for key, rows in groupby(data, lambda row: row[0]):
    final[key] = [row[1] for row in rows]

print(final)

Output:

{'bird': ['beak'], 'dog': ['tail', 'paw', 'bark'], 'cat': ['tail', 'whisker', 'meow', 'black']}
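
One caveat with `groupby` is that it groups runs of adjacent items only. The sketch below uses a hypothetical `rows` list in which the 'cat' entries are interleaved with other keys, to show what happens without sorting and how sorting by key first avoids it. This is an illustration under that assumption, not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows where the 'cat' entries are not adjacent.
rows = [("cat", "tail"), ("dog", "paw"), ("cat", "meow"), ("bird", "beak")]
first = itemgetter(0)

# Without sorting, 'cat' appears as two separate groups and the
# second one overwrites the first in the resulting dictionary.
unsorted_result = {k: [v for _, v in grp] for k, grp in groupby(rows, key=first)}

# sorted() is stable, so values keep their input order within each key.
sorted_rows = sorted(rows, key=first)
sorted_result = {k: [v for _, v in grp] for k, grp in groupby(sorted_rows, key=first)}

print(unsorted_result)
print(sorted_result)
```

If the input order cannot be guaranteed, sorting costs O(n log n), so a plain dictionary built in one pass may be the simpler choice.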

score 0 · Answer 3 · answered Jan 27 '17 at 15:24

Let's suppose you have already split your input on "\n":

d = {}
tab = ['cat tail', 'cat whisker', 'cat meow', 'cat black', 'dog tail', 'dog paw', 'dog bark', 'bird beak']
for i in tab:
    parts = i.split(" ")  # split once and reuse the result
    try:
        d[parts[0]] += [parts[1]]
    except KeyError:
        # First time this key is seen: start a new list
        d[parts[0]] = [parts[1]]

Output:

{'bird': ['beak'], 'dog': ['tail', 'paw', 'bark'], 'cat': ['tail', 'whisker', 'meow', 'black']}

score 0 · Answer 4 · answered Jan 27 '17 at 16:40

This can be solved with a defaultdict:

Code:

from collections import defaultdict

def main():
    keys = []
    values = []

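    # "rU" was removed in Python 3.11; plain "r" already handles universal newlines
    with open('animal-trial', "r") as f:
        for line in f:
            line = line.split()
            keys.append(line[0])
            values.append(line[1])

    # The file is no longer needed here, so build the dictionary outside the with block
    d = defaultdict(list)
    for k, v in zip(keys, values):
        d[k].append(v)
    print(dict(d))

if __name__ == "__main__":
    main()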

Output:

{'cat': ['tail', 'whisker', 'meow', 'black'], 'bird': ['beak'], 'dog': ['tail', 'paw', 'bark']}
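
The intermediate `keys` and `values` lists are not strictly needed. As a sketch, the same defaultdict idea can be written as a single pass over the lines. The helper name `group_pairs` is hypothetical, and splitting on the first run of whitespace is an assumption that lets values contain spaces.

```python
from collections import defaultdict

def group_pairs(lines):
    """Group the second field of each line under the first field."""
    grouped = defaultdict(list)
    for line in lines:
        fields = line.split(None, 1)  # split on the first run of whitespace only
        if len(fields) == 2:          # ignore blank or one-column lines
            grouped[fields[0]].append(fields[1].strip())
    return dict(grouped)

# Any iterable of lines works, including an open file object.
result = group_pairs(["cat\ttail\n", "cat\twhisker\n", "\n", "dog\tpaw\n"])
print(result)
```

Because the function accepts any iterable of strings, it could be called as `group_pairs(f)` inside a `with open(...) as f:` block.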
