I'm translating an R script into Python, but I found that the same hash call gives different results in Python and R.

In R:

> digest('0003bf82df1e0255a352b89d431a831d_NA', algo='xxhash32')
[1] "d6865d43"

In Python:

>>> xxhash.xxh32('0003bf82df1e0255a352b89d431a831d_NA').hexdigest()
'3c0493fd'

Both use the same algorithm, and both use the default seed of 0. Why do the results differ?

Any help would be appreciated!

Mr369

1 Answer

seed=0 will only give the same value stream when it is passed to the same random number generator on repeated runs.

Passing seed=0 to two different random number generators will give different value streams.
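The point about seeds can be sketched in Python. This is only an illustration under stated assumptions: `TinyLCG` is a hypothetical toy generator written for this example (it is not what R or Python uses), and `stream` is a helper name introduced here.

```python
import random


class TinyLCG:
    """Hypothetical second generator: a toy linear congruential generator."""

    def __init__(self, seed):
        self.state = seed

    def random(self):
        # Multiplier and increment picked for illustration only.
        self.state = (1664525 * self.state + 1013904223) % 2**32
        return self.state / 2**32


def stream(gen, n=5):
    """Draw n values from any object that exposes a random() method."""
    return [gen.random() for _ in range(n)]


same_a = stream(random.Random(0))  # Python's built-in generator, seed 0
same_b = stream(random.Random(0))  # same generator, same seed again
other = stream(TinyLCG(0))         # different generator, same seed

print(same_a == same_b)  # the two built-in streams match
print(same_a == other)   # the toy generator's stream does not
```

The first comparison prints `True` and the second prints `False`: the seed only pins down the stream for one specific generator implementation.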

While it is true that R's and Python's random number generators are both a Mersenne Twister by default, the underlying implementations are demonstrably different.

Because random number generation relies on the underlying implementation, R is not even consistent across versions. Technically Python isn't either, since it used a different random number generator prior to Python 2.3, but all currently supported versions of Python are consistent.

Henry Ecker
    Good point, but FWIW I believe that it is just the *defaults* that are inconsistent across R versions (note that the link you give is specifically about the behaviour of `sample()`); for example, `RNGversion("2.0")` will set the random-number generation machinery to be identical to that used in R version 2.0. – Ben Bolker Mar 31 '21 at 19:16