28

I have a dictionary like this:

user_dict = {
    "user1": [("video1", 10), ("video2", 20), ("video3", 1)],
    "user2": [("video1", 4), ("video2", 8), ("video6", 45)],
    # ...
    "user100": [("video1", 46), ("video2", 34), ("video6", 4)],
}

Each tuple is (video ID, number of requests), so (video1, 10) means video1 was requested 10 times.

Now I want to randomly choose 10 users and do some calculations, such as:

 1. Count the number of video IDs for each user.
 2. Sum the number of requests for these 10 random users, etc.

Then I need to increase the number of randomly chosen users to 20, 30, and 40.

But "random.choice" can only choose one value at a time, right? How can I choose multiple keys together with the list that follows each key?

manxing

4 Answers

48

That's what random.sample() is for:

Return a k length list of unique elements chosen from the population sequence. Used for random sampling without replacement.

This can be used to choose the keys. The values can subsequently be retrieved by normal dictionary lookup:

>>> import random
>>> d = dict.fromkeys(range(100))
>>> keys = random.sample(list(d), 10)
>>> keys
[52, 3, 10, 92, 86, 42, 99, 73, 56, 23]
>>> values = [d[k] for k in keys]

Alternatively, you can sample the key/value pairs directly from list(d.items()).
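Applied to the data in the question, this could look roughly like the sketch below. The generated `user_dict` is only stand-in data with the same shape as the question's dictionary, and `summarize_sample` is a hypothetical helper name, not something from the standard library.

```python
import random

# Hypothetical stand-in data shaped like the question's user_dict.
user_dict = {
    f"user{i}": [(f"video{v}", i + v) for v in range(1, 4)]
    for i in range(1, 101)
}

def summarize_sample(users, k, rng=random):
    """Sample k users without replacement and compute simple statistics."""
    chosen = rng.sample(list(users), k)
    # Number of distinct video IDs requested by each sampled user.
    videos_per_user = {u: len({vid for vid, _ in users[u]}) for u in chosen}
    # Total number of requests across all sampled users.
    total_requests = sum(n for u in chosen for _, n in users[u])
    return videos_per_user, total_requests

# Repeat the calculation for growing sample sizes.
for k in (10, 20, 30, 40):
    videos_per_user, total_requests = summarize_sample(user_dict, k)
    print(k, len(videos_per_user), total_requests)
```

The `rng` parameter is optional; passing a seeded `random.Random` instance makes the sample reproducible.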

Haha TTpro
Sven Marnach
  • If you wanted to get the keys and values, you could just use `random.sample()` on `dict.items()`, rather than getting a key then doing a lookup. – Gareth Latty Apr 12 '12 at 14:29
  • You need list(d), or you will get this error: TypeError: Population must be a sequence or set. For dicts, use list(d). – Haha TTpro Apr 21 '16 at 15:34
  • Note that this appears to have O(n) time complexity (in the length of the dict) due to having to do list(d). – zplizzi Dec 24 '19 at 21:11
  • A faster approach is to sample from `dict.keys()`, which you can do without casting to a list. See e.g. the middle solution in this answer: https://stackoverflow.com/a/40002638/2989201 – zplizzi Dec 24 '19 at 21:14
  • Actually that is also O(n); the list construction from the `.keys()` iterator just happens internally in `random.sample()`. – zplizzi Dec 24 '19 at 21:29
  • @zplizzi I don't think there is a way to make this O(k) for dictionaries. The internal structure used since Python 3.6 might allow this (not sure about this, have to double check), but the Python interface of the dictionary is not powerful enough to actually leverage that internal structure. In older versions of Python, it's not even possible with C code. – Sven Marnach Dec 25 '19 at 21:53
  • @SvenMarnach Yeah, you are correct, although I think it's useful for people to know the time complexity of a solution, especially when it's something that seems like it could be constant-time but isn't. For anyone trying to find a simple constant-time solution, check out https://github.com/robtandy/randomdict (and the minor bug fixed in an open PR there). – zplizzi Dec 26 '19 at 16:00
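To illustrate the time-complexity point from the comments: one common way to avoid the O(n) list(d) copy on every call is to keep the keys in a list alongside the dictionary, so that `random.sample()` can work on an existing sequence. The sketch below is a hypothetical helper written for illustration only; it is not the code of the linked randomdict project, and the class and attribute names are assumptions.

```python
import random

class SampleableDict:
    """Hypothetical helper: a mapping that also tracks its keys in a list."""

    def __init__(self):
        self._values = {}  # key -> value
        self._keys = []    # keys in arbitrary order
        self._index = {}   # key -> position in self._keys

    def __setitem__(self, key, value):
        if key not in self._values:
            self._index[key] = len(self._keys)
            self._keys.append(key)
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def __delitem__(self, key):
        del self._values[key]
        pos = self._index.pop(key)
        last = self._keys.pop()
        # If the deleted key was not the last one, move the last key
        # into the vacated slot so that removal stays O(1).
        if pos < len(self._keys):
            self._keys[pos] = last
            self._index[last] = pos

    def __len__(self):
        return len(self._keys)

    def sample(self, k):
        """Return k distinct (key, value) pairs chosen at random."""
        return [(key, self._values[key]) for key in random.sample(self._keys, k)]
```

The trade-off is extra memory and bookkeeping on every insert and delete, in exchange for sampling without rebuilding a list of all keys each time.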
3

If you wanted to return a dictionary, you could use a dictionary comprehension instead of the list comprehension in Sven Marnach's answer like so:

import random

d = dict.fromkeys(range(100))
# list(d) is needed because random.sample() requires a sequence.
keys = random.sample(list(d), 10)
sample_d = {k: d[k] for k in keys}
jss367
1

I have worked on this problem:

import random

def random_a_dict_and_sample_it(a_dictionary, a_number):
    # Pick a_number distinct keys, then copy their key/value pairs into a new dict.
    sampled = {}
    for k1 in random.sample(list(a_dictionary.keys()), a_number):
        sampled[k1] = a_dictionary[k1]
    return sampled

In your case:

user_dict = random_a_dict_and_sample_it( user_dict , 2 )
Gromph
0

Simply

import random
mydict = {"a":"its a", "b":"its b"}
random.sample(list(mydict.items()), 1)

# [('b', 'its b')]
# or
# [('a', 'its a')]
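If more than one pair is needed, the sampled items can be turned back into a dictionary. This is a small sketch; the extra "c" and "d" entries are added here only so that there is something to sample from.

```python
import random

mydict = {"a": "its a", "b": "its b", "c": "its c", "d": "its d"}

# Materialize the items view so random.sample() receives a sequence.
pairs = random.sample(list(mydict.items()), 2)

# Turn the sampled (key, value) pairs back into a dictionary.
sample_dict = dict(pairs)
print(sample_dict)
```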

TheMechanic
Deepak Sharma