On Windows, I am using the CryptGenRandom API in C (I thought it would be equivalent to /dev/random or /dev/urandom on Linux). To confirm this, I generated random files with CryptGenRandom on Windows and by reading from /dev/urandom on Linux, and analyzed the results using ent.
The code sample I used to generate the random file using CryptGenRandom (originally from here):
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

static void
secure_entropy(void *buf, size_t len)
{
    HCRYPTPROV h = 0;
    DWORD type = PROV_RSA_FULL;
    DWORD flags = CRYPT_VERIFYCONTEXT | CRYPT_SILENT;
    if (!CryptAcquireContext(&h, 0, 0, type, flags) ||
        !CryptGenRandom(h, len, buf)) {
        printf("failed to gather entropy");
        abort();
    }
    CryptReleaseContext(h, 0);
}

void test4()
{
    size_t size = 1 << 20;
    FILE *tfile = fopen("random_file", "w");
    char *buf = malloc(size);
    secure_entropy(buf, size);
    fwrite(buf, 1, size, tfile);
    fclose(tfile);
    free(buf);
}
However, ent shows that the arithmetic mean of the random data is around 127.05 instead of 127.5 (as on Linux). I am confident that this is not a coincidence, since I reproduced it several times on different computers and the result is consistent. To investigate further, I wrote a Python script to count the frequency of each byte value (from 0 to 255).
f = open("random_file", "rb")
a = f.read()
f.close()
tmp = [0 for _ in range(256)]
for x in a:
    tmp[int(x)] += 1
print(tmp)
The result looks similar to this:
[4101, 4026, 4027, 4074, 4200, 4021, 4121, 4066, 4035, 3972, 4127, 4010,
3978, 8214, 4009, 4155, 4083, 4065, 4067, 4064, 3993, 4021, 4136, 4112, 4221,
4172, 4134, 4117, 3972, 4127, 4175, 4110, 4125, 4181, 4092, 4157, 4122, 4024,
4020, 4088, 3980, 4140, 4159, 4129, 4064, 4141, 4096, 4238, 4036, 4080, 4151,
4115, 4086, 4156, 4111, 4106, 4086, 4058, 4179, 4193, 4144, 4206, 4180, 4028,
4148, 4015, 3979, 4201, 4098, 4146, 4169, 4120, 4044, 4066, 4049, 4051, 4051,
4122, 4048, 4139, 4125, 4052, 4224, 4091, 4084, 4040, 4183, 4134, 3948, 4132,
3955, 4162, 4183, 4014, 4100, 4091, 4005, 4146, 4182, 4032, 4037, 3985, 4098,
4078, 4147, 4060, 4085, 4215, 4039, 4187, 4207, 4161, 4086, 4159, 4018, 4073,
4051, 4008, 4095, 4110, 4160, 4288, 4077, 4074, 4113, 4104, 4097, 4115, 4049,
3963, 4083, 4111, 4066, 4084, 4107, 4035, 3977, 4078, 4035, 4008, 3993, 4080,
4152, 4121, 4111, 4033, 4094, 4191, 4131, 3978, 4082, 4134, 4119, 4135, 4071,
3993, 3888, 4137, 4188, 4110, 4078, 4186, 4188, 4074, 4196, 4110, 4069, 4135,
4043, 4150, 4023, 4095, 4074, 4179, 4112, 4084, 4124, 4180, 4154, 3996, 4103,
4199, 4137, 4155, 4039, 4077, 4159, 4167, 4171, 4115, 4025, 4218, 4046, 4008,
4178, 3969, 4135, 4077, 4044, 4080, 4085, 4230, 4161, 4151, 4056, 4222, 4033,
4020, 4187, 4034, 4175, 4167, 3962, 4102, 4054, 3978, 4111, 4001, 4028, 4103,
4088, 4054, 4049, 4164, 4136, 4110, 4181, 3964, 4098, 4046, 3997, 4151, 4122,
4272, 4067, 4112, 4037, 4083, 4072, 4106, 4105, 4104, 4166, 4090, 4071, 4080,
4070, 4087, 4162, 4060, 4237, 4061, 4044, 4128, 4051, 4097]
Here it is clear that 13 (the 14th entry) occurs approximately twice as often as any other value, which would also explain the arithmetic mean of 127.05.
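As a rough sanity check of that last claim, here is a minimal sketch. It assumes (this is my assumption, not something I have verified) that all other values are uniformly distributed and that the surplus occurrences of 13 are extra bytes on top of the 1 MiB of data. The function name `mean_with_surplus` is made up for illustration.

```python
def mean_with_surplus(n_uniform, surplus, value):
    """Mean of n_uniform uniformly distributed bytes plus
    `surplus` additional bytes that all equal `value`."""
    uniform_mean = 255 / 2  # 127.5 for byte values 0..255
    total = uniform_mean * n_uniform + value * surplus
    return total / (n_uniform + surplus)


# 1 MiB of data with roughly 4100 surplus occurrences of byte 13
# (8214 observed versus about 4100 for a typical value).
print(round(mean_with_surplus(1 << 20, 4100, 13), 2))  # prints 127.05
```

Under these assumptions the single overrepresented value is enough to pull the mean from 127.5 down to about 127.05, which matches what ent reports.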
I am not sure whether this is a bug in CryptGenRandom or whether I am using it incorrectly, but I have tested it on both my 64-bit Windows 10 and 32-bit Windows 7 computers, and the result is consistent. Does anyone have an idea what is going on, or could anyone help investigate and confirm it?
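For anyone who wants to reproduce this, the following is a small diagnostic sketch rather than a definitive test. It flags byte values whose count deviates strongly from a uniform distribution and prints the file size next to the 1 MiB that the C code intended to write. The helper name `outlier_bytes` and the threshold of 6 standard deviations are my own arbitrary choices.

```python
import math
import os
from collections import Counter


def outlier_bytes(data, threshold=6.0):
    """Return (value, count, z_score) for every byte value whose count
    deviates from the uniform expectation by more than `threshold` sigmas."""
    n = len(data)
    if n == 0:
        return []
    expected = n / 256
    # Standard deviation of one bin count under a uniform (binomial) model.
    sigma = math.sqrt(n * (1 / 256) * (255 / 256))
    counts = Counter(data)  # iterating over bytes yields ints 0..255
    result = []
    for value in range(256):
        count = counts.get(value, 0)
        z = (count - expected) / sigma
        if abs(z) > threshold:
            result.append((value, count, round(z, 1)))
    return result


if __name__ == "__main__":
    path = "random_file"  # same file name as in the C code above
    if os.path.exists(path):
        with open(path, "rb") as f:
            data = f.read()
        print("file size:", len(data), "bytes; requested:", 1 << 20)
        print("outliers:", outlier_bytes(data))
```

Running it on the file from Windows and on the one read from /dev/urandom should show whether 13 is the only outlier and whether the two files have the same length.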