I am looking for a way how to generate unique random (randomly-looking) alphanumeric string with a constraint that each such string has at least two different characters at non-adjacent positions. The length of the string should support at least 20 milion unique values (7 characters should be more than enough).
Example:
AAAAAAA <- first string
AAAAABB <- does not work (different, but adjacent)
ABAAAAA <- does not work (only one different)
AABAABA <- that works perfectly
I was first thinking about using some standard functions (I am using PostgreSql currently, but I can use Oracle as well). I can generate random string using something like md5()
, but I have no idea how to satisfy the other constraint. One idea was to use levenshtein
to check each new generated string against all already generated and accept it only when distance will be greater than X, but this seems like very brute force solution. Moreover, leventshtein
checks only substitution, so the two characters that are different can still be adjacent.
My current solution:
--PostgreSQL 9.5
with t as (
select
generate_series(1, 200) as id,
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1) ||
substr('abcdefghijklmnopqrstuvwxyz0123456789', trunc(random() * 36)::integer + 1, 1)
as rnd_string
)
select distinct id, rnd_string from (
select t1.id, t1.rnd_string, levenshtein(t1.rnd_string, t2.rnd_string)
from t t1
join t t2 on t1.id < t2.id
where levenshtein(t1.rnd_string, t2.rnd_string) > 3
) x
order by id
With 200 IDs it filters only one or two strings from the list, but this will grow with more records.
Related questions: