1

My dict (cpc_docs) has a structure like

{
sym1:[app1, app2, app3],
sym2:[app1, app6, app56, app89],
sym3:[app3, app887]
}

My dict has 15K keys, all unique strings. Each value is a list of app numbers, and the same app number can appear under more than one key.

I've looked at "Python: Best Way to Exchange Keys with Values in a Dictionary?", but since my values are lists, I get the error `TypeError: unhashable type: 'list'`.

I've tried the following methods:

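# Attempt 1: swap each (key, value) pair
res = dict((v, k) for k, v in cpc_docs.items())
# Attempt 2: setdefault with the whole list as the key
for x, y in cpc_docs.items():
    res.setdefault(y, []).append(x)
# Attempt 3: zip the values with the keys
new_dict = dict(zip(cpc_docs.values(), cpc_docs.keys()))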

None of these work, of course, since my values are lists and lists can't be used as dict keys.

I want each unique element from the value lists and all of its keys as a list.

Something like this:

{
app1:[sym1, sym2],
app2:[sym1],
app3:[sym1, sym3],
app6:[sym2],
app56:[sym2],
app89:[sym2],
app887:[sym3]
}

A bonus would be to order the new dict by the length of each value list, longest first. So like:

{
app1:[sym1, sym2],
app3:[sym1, sym3],
app2:[sym1],
app6:[sym2],
app56:[sym2],
app89:[sym2],
app887:[sym3]
}
Britt
  • Hint: first convert `{a: [b, c...],..}` into a list of pairs, like `[(a, b), (a, c),..]`, then build a dict like `{b: [a], c: [a],...}` from that. Each step is easy. – 9000 Jun 20 '19 at 21:19
  • Ah, good idea. Change one-to-many into one-to-one. – Britt Jun 20 '19 at 21:19
  • @9000 Any hints on sorting the dict created from the code below by the length of each set? – Britt Jun 21 '19 at 13:41
  • @Britt: While it *is* now possible to give order to dictionaries in Python 3.6+ (as they preserve insertion order), it's generally a bad idea to rely upon it, since it's a new feature in Python. In earlier versions the iteration order of a dictionary was an implementation detail, and could change as more items were added or removed. And for most dictionary operations (indexing), the order does not matter. – Blckknght Jun 21 '19 at 20:22
  • In my use case, the order is important. Now that I have a dict sorted by the length of the value set, I want to search for a particular value and grab all the keys with that value in their value sets. The keys here are documents and the sets of values are classifications on those documents. So by sorting according to the length of the sets, I get more relevant documents at the beginning of the list. – Britt Jun 21 '19 at 21:01

2 Answers

1

Your setdefault code is almost there; you just need an extra inner loop over each list of values:

res = {}

for k, lst in cpc_docs.items():
    for v in lst:
        res.setdefault(v, []).append(k)
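
To check the loop above against the example in the question, here is a minimal runnable sketch. The string values in `cpc_docs` are placeholders standing in for the real symbols and app numbers, and the printed order assumes Python 3.7+, where dicts preserve insertion order.

```python
# Sample data mirroring the structure in the question (placeholder strings).
cpc_docs = {
    "sym1": ["app1", "app2", "app3"],
    "sym2": ["app1", "app6", "app56", "app89"],
    "sym3": ["app3", "app887"],
}

res = {}
for k, lst in cpc_docs.items():
    # Each element of the list becomes a key; the original key is appended to its list.
    for v in lst:
        res.setdefault(v, []).append(k)

print(res["app1"])  # ['sym1', 'sym2']
print(res["app3"])  # ['sym1', 'sym3']
```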
Blckknght
0

First, create a list of (key, value) tuples:

new_list = []
for k, v in cpc_docs.items():
    # Pair the original key with every element of its list
    for item in v:
        new_list.append((k, item))

Then, for each tuple in the list, use the value as the new key (defaultdict creates it on first use) and add the original key to its set:

from collections import defaultdict

doc_cpc = defaultdict(set)

for k, v in new_list:
    doc_cpc[v].add(k)

Probably many better ways, but this works.
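
For the bonus ordering asked about in the question, one possible sketch is below. It assumes Python 3.7+ (dicts keep insertion order), builds the sets in a single pass rather than through an intermediate list, and uses placeholder strings for the data; the names `sym`, `app`, and `ordered` are only illustrative.

```python
from collections import defaultdict

# Placeholder data with the same shape as the question's dict.
cpc_docs = {
    "sym1": ["app1", "app2", "app3"],
    "sym2": ["app1", "app6", "app56", "app89"],
    "sym3": ["app3", "app887"],
}

# Invert: each app maps to the set of symbols whose list contains it.
doc_cpc = defaultdict(set)
for sym, apps in cpc_docs.items():
    for app in apps:
        doc_cpc[app].add(sym)

# Rebuild the dict with the largest sets first. sorted() is stable,
# so entries with equal set sizes keep their original relative order.
ordered = dict(sorted(doc_cpc.items(), key=lambda item: len(item[1]), reverse=True))

print(list(ordered))
# ['app1', 'app3', 'app2', 'app6', 'app56', 'app89', 'app887']
```

Note that `ordered` is a plain dict, so anything inserted later goes to the end and the sort would need to be redone.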

Community
Britt