I often come across the need for "shuffling" a Python list in a manner similar to the sorted function, i.e. returning a shuffled copy of the list without mutating the original.
I know one can shallow-copy the list and then shuffle the copy:
import random

a = [random.randint(1, 1000) for _ in xrange(100)]
b = a[:]
random.shuffle(b)
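For reference, the sorted-style wrapper I have in mind would look something like this (a sketch only; shuffled is my own name, not a standard library function):

def shuffled(xs):
    # Hypothetical sorted-style helper: shuffle a shallow copy
    # and return it, leaving the original list untouched.
    ys = xs[:]
    random.shuffle(ys)
    return ys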
But the functional programmer in me wants a built-in solution that is not mutative (and fitting into a single readable line is a big bonus), and so I want to use the sample function:
a = [random.randint(1, 1000) for _ in xrange(100)]
b = random.sample(a, len(a))
However, nowhere in the documentation can I find any indication of whether this is a good idea in terms of cost, time- or space-wise, asymptotically or in constant factors.
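Absent guidance from the documentation, the constant factors can at least be probed empirically; a quick timeit comparison along these lines (a sketch only; absolute timings will vary by interpreter and platform) is what I would reach for:

import timeit

# Build the same 100-element list in the setup so that only the
# shuffle/sample step itself is being timed.
setup = "import random; a = [random.randint(1, 1000) for _ in xrange(100)]"

print(timeit.timeit("b = a[:]; random.shuffle(b)", setup=setup, number=10000))
print(timeit.timeit("b = random.sample(a, len(a))", setup=setup, number=10000))

But micro-benchmarks at a single list size say nothing about asymptotics, which is why I am asking here.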
The post What is the difference between random.sample and random.shuffle in Python appears to be asking a similar if broader question, but a) it does not have an accepted answer, and more importantly, b) none of the answers mention the relative merits in terms of time or space. All 4 answers apart from the most-voted one specifically discuss output rather than algorithmic complexity.
The most-voted answer quotes the Python source code for the two functions and concludes:

the randomization is essentially done by the line int(random() * n). So, the underlying algorithm is essentially the same
I think this suggests that the output distributions are the same, but it doesn't really address time complexity, since the surrounding code differs significantly between the two functions.
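For reference, my mental model of shuffle is the textbook in-place Fisher-Yates pass, roughly like this simplified sketch (fisher_yates is my own name; the real function also accepts an optional random argument):

def fisher_yates(xs):
    # Walk the list from the back, swapping each element with a
    # uniformly chosen element at or before it; this is the loop
    # that random.shuffle runs in place.
    for i in reversed(xrange(1, len(xs))):
        j = int(random.random() * (i + 1))  # 0 <= j <= i
        xs[i], xs[j] = xs[j], xs[i]

sample, by contrast, builds a new list as it selects, so even if each step comes down to the same int(random() * n) call, the bookkeeping around it differs.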
So my question is: what is the time/space cost difference, preferably both asymptotically and in terms of constant factors, between random.shuffle on a shallow copy and random.sample(x, len(x))?
NB: I recognize the title is slightly inaccurate, since it doesn't make explicit that I want a comparison against random.shuffle(x) where x is already a shallow copy. This is purely for brevity.