4

I am running a simple piece of code to return me batch number of integers (without replacement) between 0 and 2**63.

from random import sample

batch = 1000
n = 63
rand_nos = sample(range(2**n), batch)

I get the error

Python int too large to convert to C ssize_t

I assume this has got to do with the random.sample function internally converting the length of the range to an int value. But I could not find any parameter that I could set, to ask the function to use a data type with larger range. How can I go about this? Thanks.

learner
  • 3,168
  • 3
  • 18
  • 35
  • those are rather large random numbers. What are you using them for? – Mitch Wheat Jan 21 '21 at 09:32
  • I'd rather believe that exception caused by *numpy* – Olvin Roght Jan 21 '21 at 09:32
  • @MitchWheat I am working on coding theory and have to deal with large codes. This is the dimension of the code that I am working on. – learner Jan 21 '21 at 09:36
  • @OlvinRoght the error exists even without the `np.array` function outside. I'll make it clear by removing the `np` function. – learner Jan 21 '21 at 09:36
  • 1
    It can be reproduced by `len(range(2**63))` – Marc Jan 21 '21 at 09:39
  • https://stackoverflow.com/questions/10012534/how-to-generate-a-big-random-number-in-python – Mitch Wheat Jan 21 '21 at 09:40
  • 2
    @learner, check [docs](https://docs.python.org/3/library/stdtypes.html#range) of `range()`: *"Ranges containing absolute values larger than [`sys.maxsize`](https://docs.python.org/3/library/sys.html#sys.maxsize) are permitted but some features (such as [`len()`](https://docs.python.org/3/library/functions.html#len)) may raise [`OverflowError`](https://docs.python.org/3/library/exceptions.html#OverflowError)."* – Olvin Roght Jan 21 '21 at 09:41
  • @learner, you can use `randrange()` in list comp as alternative: `rand_nos = [randrange(2**n) for _ in range(batch)]` – Olvin Roght Jan 21 '21 at 09:45
  • @learner Hope you find some solution in that answer:- https://stackoverflow.com/a/38314163/3607051 – Raksha Saini Jan 21 '21 at 09:47
  • Do you really need [0...2^n] interval? Not [0...2^n) aka [0...2^n-1] ? – Severin Pappadeux Jan 21 '21 at 15:31
  • @SeverinPappadeux if you're referring to (2^n) -1 as the upper limit, then yes that would be fine. But 2^(n-1) is not okay. Thanks – learner Jan 22 '21 at 05:53

1 Answers1

0

OK, you could try to use Linear Congruential Generator, which has some interesting properties - with good choice of constants it will cover whole range [0...2n) once without repetition before it hits its period. Thus, it is basically equivalent to sampling without replacement.

Code for n=64, if you need exactly 63, I could dig out set of constant, just ask.

Python 3.9.1, Win10 x64

import numpy as np

class LCG(object):

    UZERO: np.uint64 = np.uint64(0)
    UONE : np.uint64 = np.uint64(1)

    def __init__(self, seed: np.uint64, a: np.uint64, c: np.uint64) -> None:
        self._seed: np.uint64 = np.uint64(seed)
        self._a   : np.uint64 = np.uint64(a)
        self._c   : np.uint64 = np.uint64(c)

    def next(self) -> np.uint64:
        self._seed = self._a * self._seed + self._c
        return self._seed

    def seed(self) -> np.uint64:
        return self._seed

    def set_seed(self, seed: np.uint64) -> np.uint64:
        self._seed = seed

    def skip(self, ns: np.int64) -> None:
        """
        Signed argument - skip forward as well as backward

        The algorithm here to determine the parameters used to skip ahead is
        described in the paper F. Brown, "Random Number Generation with Arbitrary Stride,"
        Trans. Am. Nucl. Soc. (Nov. 1994). This algorithm is able to skip ahead in
        O(log2(N)) operations instead of O(N). It computes parameters
        A and C which can then be used to find x_N = A*x_0 + C mod 2^M.
        """

        nskip: np.uint64 = np.uint64(ns)

        a: np.uint64 = self._a
        c: np.uint64 = self._c

        a_next: np.uint64 = LCG.UONE
        c_next: np.uint64 = LCG.UZERO

        while nskip > LCG.UZERO:
            if (nskip & LCG.UONE) != LCG.UZERO:
                a_next = a_next * a
                c_next = c_next * a + c

            c = (a + LCG.UONE) * c
            a = a * a

            nskip = nskip >> LCG.UONE

        self._seed = a_next * self._seed + c_next


#%%
np.seterr(over='ignore')

seed = np.uint64(1)

rng64 = LCG(seed, np.uint64(6364136223846793005), np.uint64(1))

print(rng64.next())
print(rng64.next())
print(rng64.next())

#%%
rng64.skip(-3) # back by 3
print(rng64.next())
print(rng64.next())
print(rng64.next())

rng64.skip(-3) # back by 3
rng64.skip(2) # forward by 2
print(rng64.next())
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64