I am looking for a bijective transform that I can use NOT for encryption, but for obfuscation. The idea is to take integers such as 1, 2, 3, 4
and project them onto another axis. The main idea is to ensure a maximum distance between two adjacent integers from domain A
when projected in domain B
.
I wrote the following in Python, but the target language will be PHP. I don't mind about the implementation, only the algorithm/method matters.
Example for n = 8
:
A B
0 <-> 4
1 <-> 5
2 <-> 6
3 <-> 0
4 <-> 2
5 <-> 7
6 <-> 3
7 <-> 1
One naive implementation would be to build a random one-to-one correspondance table then use as a O(1) transform function.
a = func(0) # a = 4
b = func(a) # b = 0
Most efficient algorithm
The most efficient algorithm consists of building a hash-table which requires O(n) memory and executes in O(1).
class Shuffle:
def __init__(self, n, seed=42):
self._table = list(range(n))
random.seed(seed)
random.shuffle(self._table)
def enc(self, n):
return self._table[n]
def dec(self, n):
return self._table.index(n)
Unfortunately I am looking for a lighter perhaps simpler algorithm.
Dumb attempt
Here comes the fun part, because I have tried to play with One-Time-Pad (OTP), some symmetric encryption algorithms (Blowfish) without success. Eventually I tried to play with a dumb funny naive kind of implementation:
def enigma(input): # Because why not?
a = deque([1, 3, 0, 4, 2])
b = deque([3, 0, 4, 2, 1])
u = [0 for i in range(len(input))]
for i, c in enumerate(input):
u[i] = a.index(b.index([4, 3, 2, 1, 0][b[a[c]]]))
a.rotate(1)
b.rotate(3)
return u
def shuffle(n, seed=81293): # Because enigma is not enough
random.seed(seed)
serie = list(range(len(n)))
random.shuffle(serie)
for i, j in enumerate(serie):
n = n.copy()
n[i], n[j] = n[j], n[i]
return n
def int2five(decimal): # Base 10 -> Base 5
radix = 5
s = [0 for i in range(9)]
i = 0
while (decimal):
s[i] = decimal % radix
decimal = decimal // radix
i += 1
return list(reversed(s))
def five2int(five): # Base 5 -> Base 10
return int(''.join([str(x) for x in five]), 5)
def enc(n):
return five2int(enigma(shuffle(int2five(n))))
def dec(s):
return five2int(shuffle(enigma(int2five(s))))
With as result:
>>> ['%d <-> %d' % (i, enc(i)) for i in range(7)]
['0 <-> 1729928',
'1 <-> 558053',
'2 <-> 1339303',
'3 <-> 948678',
'4 <-> 167428',
'5 <-> 1729943',
'6 <-> 558068']
Why did I choose base 5? I choose my work range to be [0..2e6[
, so by looking at the best base to work with that maximise the coverage (yeah that following code is ugly):
>>> [(d, l, s, 100 - round(s/n*100,1)) for (d, l, s) in sorted([(k, round(math.log(n, k), 3), n - k**(math.floor(math.log(n, k))) - 1)
for k in range(2, 100)], key=lambda k: k[2])[0:10]]
[(5, 9.015, 46873, 97.7),
(18, 5.02, 110430, 94.5),
(37, 4.018, 125837, 93.7),
(11, 6.051, 228437, 88.6),
(6, 8.097, 320382, 84.0),...]
I notice that base 5 can be expressed in 9 digits with a coverage of 97% of my range. The second best candidate is base 18 with 94% of the coverage.