
I have a defaultdict(list) named potential_terms:

potential_terms={9: ['leather'], 10: ['type', 'polyester'], 13:['hello','bye']}

What I want to output is the 2 values (words) with the lowest keys. 'leather' is definitely the first output, but 'type' and 'polyester' both have k=10. When the key is the same, I want a random choice of either 'type' or 'polyester'.

What I did is:

out=[v for k,v in sorted(potential_terms.items(), key=lambda x:(x[0],random.choice(x[1])))][:2]

but when I print out, I get:

[['leather'], ['type', 'polyester']]

My guess is that the problem is the second part of the lambda function: random.choice(x[1]). Any ideas on how to make it work as expected, so that it outputs either 'type' or 'polyester'?

Thanks

IS92

3 Answers

5

EDIT: See Karl's answer and comment as to why this solution isn't correct for OP's problem. I leave it here because it does demonstrate what OP originally got wrong.

key= doesn't transform the data itself; it only tells sorted how to order the items.

You want to apply choice to v when selecting it in the comprehension, like so:

out=[random.choice(v) for k,v in sorted(potential_terms.items())[:2]]

(I also moved the [:2] inside, to shorten the list before the comprehension)

Output:

['leather', 'type']

OR

['leather', 'polyester']
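
To illustrate the point about key= made above, here is a minimal sketch. The dictionary is deliberately written out of order (an assumption made for the demo) so the effect of sorting is visible. The items come back unchanged, and any transformation has to happen in the expression part of the comprehension.

```python
# Same data as in the question, but inserted out of order for the demo.
potential_terms = {13: ['hello', 'bye'], 10: ['type', 'polyester'], 9: ['leather']}

# key= is only consulted to decide the order; the items themselves are returned as-is.
ordered = sorted(potential_terms.items(), key=lambda item: item[0])
print(ordered)
# [(9, ['leather']), (10, ['type', 'polyester']), (13, ['hello', 'bye'])]

# The values are still lists. To get words, transform them in the comprehension.
first_words = [words[0] for _, words in ordered[:2]]
print(first_words)  # ['leather', 'type']
```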
Adam.Er8
  • This doesn't give the expected result as described. Rather than choosing the two words *with* the lowest values, breaking ties randomly, it will randomly choose a word for each of the two lowest values. The result will be different when the lowest key in the dictionary has more than one word in its value (it should select both words from that list, but won't with this approach). It will also break if any key has an empty list. – Karl Knechtel Oct 27 '21 at 14:12
  • @KarlKnechtel Oh I see, correct. great answer! – Adam.Er8 Oct 27 '21 at 14:14
2

You have (with some extra formatting to highlight the structure):

out = [
    v
    for k, v in sorted(
        potential_terms.items(),
        key=lambda x:(x[0], random.choice(x[1]))
    )
][:2]

This means (reading from the inside out): sort the items according to the key, breaking ties using a random choice from the value list. Extract the values (which are lists) from those sorted items into a list (of lists). Finally, get the first two items of that list of lists.

This doesn't match the problem description, and is also somewhat nonsensical: since the keys are, well, keys, there cannot be duplicates, and thus there cannot be ties to break.
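
A quick way to check this, as a sketch using the data from the question: sort once with the two-part key and once with the dictionary key alone, then compare. Tuples are compared element by element, and since the first elements are never equal, the random second element never decides anything.

```python
import random

potential_terms = {9: ['leather'], 10: ['type', 'polyester'], 13: ['hello', 'bye']}

# Sort with the original two-part key...
with_tiebreak = sorted(
    potential_terms.items(),
    key=lambda item: (item[0], random.choice(item[1])),
)
# ...and with the dictionary key alone.
key_only = sorted(potential_terms.items(), key=lambda item: item[0])

# Dictionary keys are unique, so the first tuple element always settles the
# comparison and the random second element never changes the order.
print(with_tiebreak == key_only)  # True
```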

What we wanted: sort the items according to the key, then put all the contents of those individual lists next to each other to make a flattened list of strings, but randomizing the order within each sublist (i.e., shuffling those sublists). Then, get the first two items of that list of strings.

Thus, applying the technique from the link, and shuffling the sublists "inline" as they are discovered by the comprehension:

out = [
    term
    for k, v in sorted(
        potential_terms.items(),
        key = lambda x:x[0] # this is not actually necessary now,
        # since the natural sort order of the items will work.
    )
    for term in random.sample(v, len(v))
][:2]

Please also see https://treyhunner.com/2015/12/python-list-comprehensions-now-in-color/ to understand how the list flattening and result ordering works in a two-level comprehension like this.
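
For convenience, here is the same approach as a self-contained sketch. The helper name pick_lowest and its parameter n are made up for illustration and are not part of the original code. The extra calls show the two cases where picking one word per key goes wrong: a lowest key with several words, and a key with an empty list.

```python
import random


def pick_lowest(terms_by_key, n=2):
    """Return up to n words, lowest keys first, with ties in random order."""
    flattened = [
        term
        for _, terms in sorted(terms_by_key.items())
        # random.sample returns a new shuffled list and leaves `terms` untouched.
        for term in random.sample(terms, len(terms))
    ]
    return flattened[:n]


potential_terms = {9: ['leather'], 10: ['type', 'polyester'], 13: ['hello', 'bye']}
print(pick_lowest(potential_terms))  # 'leather', then 'type' or 'polyester'

# Both words of the lowest key are returned, in random order.
print(pick_lowest({1: ['a', 'b'], 2: ['c']}))

# An empty list contributes nothing instead of raising an error.
print(pick_lowest({1: [], 2: ['c']}))  # ['c']
```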

Karl Knechtel
  • Thank you so much for your answer. I just changed 1 thing: out = [term for k, v in sorted(potential_terms.items())for term in random.choice(v)][:2] --> replaced shuffle with choice as shuffle doesn't return anything it shuffles in place,so I was getting ( TypeError: 'NoneType' object is not iterable). I also as you said removed lambda, since the default sort is by keys – IS92 Oct 28 '21 at 07:42
  • `shuffle` is in-place, but `choice` only gives you one element from each sub-list - the problem I pointed out on Adam's answer. Fortunately, this can still be fixed - see the edit. – Karl Knechtel Oct 29 '21 at 10:21
1

Instead of building out with a comprehension, a simpler approach is d = list(potential_terms.values()), which stores all the values in a list (in insertion order, which here is also ascending key order):

[['leather'], ['type', 'polyester'], ['hello', 'bye']]

You can access 'leather' as d[0][0] and the list ['type', 'polyester'] as d[1]. Now shuffle that list in place with random.shuffle(d[1]) and take d[1][0], which gives a random word, either 'type' or 'polyester'.

Final code should be like this:

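import random
potential_terms = {9: ['leather'], 10: ['type', 'polyester'], 13: ['hello', 'bye']}
d = list(potential_terms.values())  # values in insertion order
random.shuffle(d[1])  # shuffles the list for key 10 in place
c = []
c.append(d[0][0])  # 'leather'
c.append(d[1][0])  # randomly 'type' or 'polyester'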

This gives the desired output in c, either ['leather', 'polyester'] or ['leather', 'type'].