TL;DR

Each result is generated with equal probability, and the algorithm runs in O(n) time:
```python
import random

import numpy as np

_prob_cache = [1, 1]

def prob(n):
    # Probability that the 1 in the current row falls on the diagonal;
    # see the derivation in the long answer below.
    try:
        return _prob_cache[n]
    except IndexError:
        pass
    for i in range(len(_prob_cache) - 1, n):
        _prob_cache.append(1 / (i * _prob_cache[-1] + 1))
    return _prob_cache[-1]

def symmetric_permutation(n):
    res = np.zeros((n, n), int)
    remain = list(range(n))
    while remain:
        m = len(remain)
        diag_prob = prob(m)
        row = remain.pop()
        rnd = random.random()
        if rnd < diag_prob:
            col = row
        else:
            nondiag_prob = (1 - diag_prob) / (m - 1)
            idx = int((rnd - diag_prob) / nondiag_prob)
            remain[idx], remain[-1] = remain[-1], remain[idx]
            col = remain.pop()
        res[row, col] = res[col, row] = 1
    return res
```
Long Answer
Begin with some derivation:

Let f(n) be the number of valid setting schemes for an n * n matrix. Obviously, we have:

f(1) = 1

Then take a convention:

f(0) = 1

For n > 1, pick any row and set one position in it to 1. There are two cases:

- If the 1 is on the diagonal, we can remove the row and column of this 1 and continue to fill the remaining (n - 1) * (n - 1) matrix, so the number of remaining setting schemes is f(n - 1).
- If the 1 is not on the diagonal, the symmetric position also needs to be set to 1. We can then remove the rows and columns where the two 1's are located and continue to fill the remaining (n - 2) * (n - 2) matrix, so the number of remaining setting schemes is f(n - 2). Since the row offers n - 1 off-diagonal positions to choose from, this case contributes (n - 1) * f(n - 2) schemes in total.

Therefore, we can deduce:

f(n) = f(n - 1) + (n - 1) * f(n - 2)
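As a quick sanity check (my addition, not part of the original answer), the recurrence can be verified by brute force for small n: a symmetric permutation matrix corresponds to an involution, i.e. a permutation p with p[p[i]] == i for all i.

```python
from itertools import permutations

def count_involutions(n):
    # Brute force: count permutations that are their own inverse,
    # which is exactly the number of symmetric permutation matrices.
    return sum(
        all(p[p[i]] == i for i in range(n))
        for p in permutations(range(n))
    )

def f_rec(n):
    # The recurrence derived above, iterated from f(0) = f(1) = 1.
    a, b = 1, 1
    for i in range(1, n):
        a, b = b, b + a * i
    return b

# f(0..7) = 1, 1, 2, 4, 10, 26, 76, 232 (the involution numbers)
assert all(count_involutions(n) == f_rec(n) for n in range(8))
```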
According to the above strategy, if we want every setting scheme to appear with equal probability, we should give different weights to the diagonal index and the other indices when selecting the column index. The weight of the diagonal index should be:

p(n) = f(n - 1) / f(n)

Therefore:
```
f(n) = f(n - 1) + (n - 1) * f(n - 2)

      f(n)          (n - 1) * f(n - 2)
=> -------- = 1 + --------------------
   f(n - 1)            f(n - 1)

     1
=> ---- = 1 + (n - 1) * p(n - 1)
   p(n)

                     1
=> p(n) = ----------------------
          (n - 1) * p(n - 1) + 1
```
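For example, f(2) = 2, f(3) = 4 and f(4) = 10 give p(3) = 2/4 = 1/2 and p(4) = 4/10 = 2/5, and the recurrence agrees: p(4) = 1 / (3 * (1/2) + 1) = 2/5.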
The probability function code is as follows:
```python
_prob_cache = [1, 1]

def prob(n):
    """
    Iterative version to prevent stack overflow caused by recursion.
    Old version:

        @lru_cache
        def prob(n):
            if n == 1:
                return 1
            else:
                return 1 / ((n - 1) * prob(n - 1) + 1)
    """
    try:
        return _prob_cache[n]
    except IndexError:
        pass
    # Extend the cache up to index n using
    # p(n) = 1 / ((n - 1) * p(n - 1) + 1).
    for i in range(len(_prob_cache) - 1, n):
        _prob_cache.append(1 / (i * _prob_cache[-1] + 1))
    return _prob_cache[-1]
```
The weight of each non-diagonal index is:

```
f(n - 2)   f(n - 2)   f(n - 1)
-------- = -------- * -------- = p(n - 1) * p(n)
  f(n)     f(n - 1)     f(n)
```

or

```
f(n - 2)   1 - p(n)
-------- = --------
  f(n)      n - 1
```
Here I choose to use the latter, since it saves one call to the probability function.
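A quick numerical check (my addition, reusing the prob function defined above) confirms that the two expressions agree:

```python
# Both expressions for the off-diagonal weight f(n - 2) / f(n)
# should agree up to floating-point rounding.
for m in range(2, 10):
    assert abs(prob(m - 1) * prob(m) - (1 - prob(m)) / (m - 1)) < 1e-12
```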
Specific implementation:

We use a list to store the indices that are still available. In each loop, we take the last element of the list as the row index (rather than the first element as described above, because this speeds up removing elements from the list), calculate the weights of the two cases, obtain the column index randomly, set the values of the corresponding positions, and remove the used indices from the list, until the list is empty:
```python
import random

import numpy as np

def symmetric_permutation(n):
    res = np.zeros((n, n), int)
    remain = list(range(n))
    while remain:
        m = len(remain)
        diag_prob = prob(m)
        row = remain.pop()
        rnd = random.random()
        if rnd < diag_prob:
            col = row
        else:
            nondiag_prob = (1 - diag_prob) / (m - 1)
            col = remain.pop(int((rnd - diag_prob) / nondiag_prob))
        res[row, col] = res[col, row] = 1
    return res
```
Optimize to O(n) time complexity:

If we do not consider the creation of the zero matrix, the time complexity of the above strategy is O(n^2), because each iteration has a high probability of removing an index from the middle of the list, which is an O(n) operation.

However, this removal is unnecessary. We have no requirements on the order of the remaining indices, because the selection of the row index does not affect the randomness of the column index. Therefore, a cheaper solution is to overwrite the selected column index with the last element and then remove the last element. This turns the O(n) removal of an intermediate element into an O(1) operation, so the time complexity becomes O(n):
```python
def symmetric_permutation(n):
    res = np.zeros((n, n), int)
    remain = list(range(n))
    while remain:
        m = len(remain)
        diag_prob = prob(m)
        row = remain.pop()
        rnd = random.random()
        if rnd < diag_prob:
            col = row
        else:
            nondiag_prob = (1 - diag_prob) / (m - 1)
            idx = int((rnd - diag_prob) / nondiag_prob)
            # Swap the chosen index to the end so removing it is O(1).
            remain[idx], remain[-1] = remain[-1], remain[idx]
            col = remain.pop()
        res[row, col] = res[col, row] = 1
    return res
```
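A quick property check (my addition) confirms that the output is indeed a symmetric permutation matrix:

```python
m = symmetric_permutation(6)
assert (m == m.T).all()            # symmetric
assert (m.sum(axis=0) == 1).all()  # exactly one 1 in every column
assert (m.sum(axis=1) == 1).all()  # exactly one 1 in every row
```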
Probability test:
Here we prepare another function to calculate f(n) for the following test:

```python
def f(n):
    # Iterate f(n) = f(n - 1) + (n - 1) * f(n - 2) from f(0) = f(1) = 1.
    before_prev, prev = 1, 1
    for i in range(1, n):
        before_prev, prev = prev, prev + before_prev * i
    return prev
```
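In particular, f(8) = 764, so in the test below each possible result is expected to occur about 500_000 / 764 ≈ 654.5 times.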
Next is a probability test to verify whether the results are uniform enough. Here I take n = 8, build the matrix 500_000 times, use the tuple of column indices of the 1's in each row as the identifier of each result, and draw a line chart and a histogram of the number of occurrences of each result:
```python
from collections import Counter

import matplotlib.pyplot as plt

random.seed(0)

n = 8
times = 500_000
n_bin = 30

cntr = Counter()
cntr.update(tuple(symmetric_permutation(n).nonzero()[1]) for _ in range(times))
assert len(cntr) == f(n)

plt.subplot(2, 1, 1).plot(cntr.values())
plt.subplot(2, 1, 2).hist(cntr.values(), n_bin)
plt.show()
```
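As an additional numeric check (my addition, reusing the variables from the test above), a chi-square statistic against the uniform distribution should come out close to its degrees of freedom, f(8) - 1 = 763, if the sampler is uniform:

```python
expected = times / f(n)  # ≈ 654.5 occurrences per result
chi2 = sum((c - expected) ** 2 / expected for c in cntr.values())
print(f"chi-square statistic: {chi2:.1f} (df = {f(n) - 1})")
```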
It can be seen from the first subplot that the number of occurrences of each result falls roughly within the range 650 ± 70, and from the second subplot that the distribution of these counts is close to a Gaussian distribution:

[Figure: line chart and histogram of occurrence counts for each result]
For @AndrzejO's answer, I ran the same test. His solution is faster (after the optimization above, the two are now almost the same speed), but the results do not appear to occur with equal probability (note that all possible results do still show up here):

[Figure: line chart and histogram of occurrence counts for @AndrzejO's solution]