I would like to make a PyTorch calculation reproducible without storing a large random vector for each step. My idea was to first generate a random seed and then re-seed the random number generator, like this:
import torch

seed = torch.rand(1, dtype=torch.float64)  # draw a random seed
torch.manual_seed(seed)  # re-seed, so we get the same vector as when using a stored seed
torch.save(seed, "seedfile")  # store the seed
myvector = torch.randn(myvector.shape)  # the vector I actually need
This way I would only need to store a single float to reproduce the result. But when I use this in a loop, I always get the same result in every iteration.
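Here is a minimal sketch of the loop I mean (the shape (8,), the per-iteration file names, and the iteration count are placeholders for this illustration, not my real values):

import torch

shape = (8,)                                   # placeholder for the real vector shape
for i in range(3):
    seed = torch.rand(1, dtype=torch.float64)  # draw a new "seed" each iteration
    torch.manual_seed(seed)                    # re-seed before generating
    torch.save(seed, f"seedfile_{i}")          # one small file per iteration
    myvector = torch.randn(shape)
    print(myvector)                            # prints the same vector every time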
To explain what I am trying to achieve: suppose I generate a batch of images in a loop, where each image depends on an initialization vector. I can reproduce an image by storing its initialization vector and loading it when I want to redo the calculation (e.g., with other hyperparameters). But since the vector is random anyway, it should be sufficient to store the random seed instead.
To do so, I currently generate a random seed (a float64 in the code above) and then manually seed with it. The manual_seed is not useful in the first run, but it should not be a problem either. When I want to reproduce the image, I do not generate the seed with torch.rand but load it from the file. This way I need less than 1 kB (torch.save has some overhead; the actual payload is just 8 bytes) instead of, e.g., 64 kB for storing the generated vector itself (at 8 bytes per float64, that would be a vector with 8192 entries). The reproduction then looks like this:
loaded_seed = torch.load("seedfile")  # load the stored seed instead of drawing a new one
torch.manual_seed(loaded_seed)  # re-seed exactly as in the original run
myvector = torch.randn(myvector.shape)  # should regenerate the same vector