DISCLAIMER: I know there's a question named

Get a random sample of a dict

but mine is clearly not a duplicate. The answers to that question mostly concentrate on computing the sum of the values of a random subset of a dictionary, because that's what the OP really wanted. I, instead, need to extract the subset itself.

I have a very large dictionary, and I want to extract a subsample, on which I then want to iterate. I tried:

import random
dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
keys = random.sample(list(dictionary), 3)
sample = dictionary[keys]

But it doesn't work:

Traceback (most recent call last):
  File "[..]/foobar.py", line 4, in <module>
    sample = dictionary[keys]
TypeError: unhashable type: 'list'

This works:

import random
dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5}
# list(dictionary) gives random.sample the sequence of keys it requires on Python 3
keys = random.sample(list(dictionary), 3)
sample = {key: dictionary[key] for key in keys}

It seems a bit wordy: I hoped there would be a vectorized way to build the new dictionary. Is this the right/most Pythonic way to do it? Also, if I want to iterate over this sample, should I do it like this:

for key, value in sample.items():
    print(key, value)

My question is not a duplicate of

how to randomly choose multiple keys and its value in a dictionary python

either, because the answer to that question doesn't fully address mine. It's even worse than my attempt: instead of creating a sample dictionary, it samples the keys and then retrieves the values separately. That's not very Pythonic, and I explicitly asked for a Pythonic answer.

DeltaIV
  • I don't understand how this isn't a duplicate. What does it matter what you do with the random subset? The process of deriving it is the same. – erik258 Nov 02 '18 at 19:49
  • How about `dict(random.sample(dictionary.items(), 3))`? – timgeb Nov 02 '18 at 19:49
  • That is the vectorized way. – Alex Reynolds Nov 02 '18 at 19:50
  • @DanFarrell it's not a duplicate of the one I included in my question, which is the one I could find with Google. All the answers to that one concentrated on extracting the _values_, but I also wanted the keys, so they didn't fully address my question. However, it's probably a duplicate of the other one, which I couldn't find. I'll read it and check if mine is a real duplicate. – DeltaIV Nov 02 '18 at 20:00
  • @timgeb your suggestion works perfectly. If you post it as an answer, I will accept it, though I don't know if you can answer a question which has been closed as a duplicate. – DeltaIV Nov 02 '18 at 20:08
  • @DanFarrell ok, I checked and it's not a duplicate: please see my edit to the question. More precisely, when I say "it's not a duplicate" I mean that the existing answers "do not fully address my question". According to the site rules, asking a new question is fine in this case. – DeltaIV Nov 02 '18 at 20:21
  • @DeltaIV I can't. – timgeb Nov 02 '18 at 20:23
  • @timgeb heh. I suspected that. It's a pity, because unlike the answers to the other questions, yours is the only one which really addresses the question. – DeltaIV Nov 02 '18 at 20:24
  • @DeltaIV maybe you get a reopen. I won't reopen it myself because that might look sketchy. – timgeb Nov 02 '18 at 20:29
  • I'm willing to vote for a reopen (for what it's worth, this was closed before I even got a chance to close it). – erik258 Nov 04 '18 at 16:07
  • @timgeb the question has been reopened! Please post your comment as an answer, and I'll accept it right away. I'm actually using it in my code. – DeltaIV Nov 05 '18 at 22:39

1 Answer

With

dict(random.sample(dictionary.items(), N))

you can select N random (key, value) pairs from your dictionary and pass them to the dict constructor.

Demo:

>>> import random
>>> dictionary = dict(enumerate(range(10)))
>>> dictionary
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>> N = 3
>>> dict(random.sample(dictionary.items(), N))
{3: 3, 6: 6, 9: 9}
timgeb
  • This Q&A is a bit old and this now raises `DeprecationWarning: Sampling from a set deprecated since Python 3.9 and will be removed in a subsequent version.` The easiest fix would be to wrap with a `list(...)` call. – Tomerikoo Jul 17 '22 at 09:00
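Following up on the `list(...)` suggestion in the comment above, here is a minimal sketch of how the wrapped call might look. From Python 3.11 on, `random.sample` only accepts sequences, so the items view has to be converted first. The helper name `sample_dict` is hypothetical and used here only for illustration; it is not part of the original answer or of the standard library.

```python
import random


def sample_dict(d, n):
    """Return a new dict holding n randomly chosen (key, value) pairs of d.

    Hypothetical helper, shown only as an illustration.
    """
    # list(...) turns the items view into a sequence; random.sample
    # deprecated set-like inputs in Python 3.9 and rejects them from 3.11.
    return dict(random.sample(list(d.items()), n))


dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
sample = sample_dict(dictionary, 3)

# Iterate over the sampled pairs (the order and the chosen keys are random).
for key, value in sample.items():
    print(key, value)
```

Note that `list(d.items())` copies every pair before sampling, so on a very large dictionary it may be cheaper to sample from `list(d)` (the keys only) and then build the result with a dict comprehension, as in the question.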