I wrote a little function that will create a random string of a certain length:
import random

def append_until_length(acceptable, length=45):
    retval = set()  # a set silently drops duplicate characters
    for _ in range(1000):
        retval.add(random.choice(acceptable))
        if len(retval) == length:
            return ''.join(retval)
This works, so that part is fine. But while running it I noticed a sort of pattern in the output:
>>> for _ in range(10):
...     append_until_length(acceptable)
...
'!#"%\'(+*-,/.057698=?ADGIHLRUV[]\\`behjmonpryx~'
'"$\')+*,025498:=?ACBGKONQPSY[]\\acdgfhkmruvy{z|'
'#"\'&)+,/03248=<?>ABFHJLOPWYXZ]cbdfhklonqrutz}'
' #"(*-/0328EIJMPSRUWVYX]_^acbegfkmlqpstwvx{}|'
'!#"(,/.032549;=>EDHMLOSYX[]_^acbedjlonprtvxz~'
" %',10346?@CEDFIKNQRVYXZ]\\_abghkjlnqpruw{z}|~"
'! #+,/035469:<@CFIKLSRUVY[Z^cbfijloqsutwvxz}|'
'$&)(+-/5;:?>ABDFIHMLOPSUTYXZa`bdhkjmonprwvx}~'
'!#"&*-/102579:=>@DFKJMLONQSTVYX\\^acimoqpstw}~'
'! &(+-/.2548:=<?A@EGFIKOQPSRTVX\\eihjonprutx}~'
>>>
If you look at this, the first few characters are always punctuation, the next few are always digits, then come the uppercase letters with some punctuation mixed in, then more punctuation, then the lowercase letters, and the last characters are always punctuation.
The acceptable characters I'm using are list(string.printable)[:-6] with a .append(" "). The length of this list is 95:
>>> acceptable
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`', '{', '|', '}', '~', ' ']
>>> len(acceptable)
95
>>>
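For reference, here is a minimal sketch of how that list is built. It assumes Python's standard string module, where the last six characters of string.printable are the whitespace characters, so slicing them off and appending a space leaves the visible characters plus a single space.

```python
import string

# string.printable ends with six whitespace characters
# (space, tab, newline, carriage return, vertical tab, form feed).
# Drop all six, then add the plain space back on the end.
acceptable = list(string.printable)[:-6]
acceptable.append(" ")

print(len(acceptable))         # 95
print(acceptable[-1] == " ")   # True
```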
Now, I understand that set() will not allow the same character to appear more than once in the string; however, that does not explain why the pattern is always roughly the same (not identical, but clearly similar). If I do this with a list instead, there is never a pattern in the output:
>>> def append_until_length(acceptable, length=45):
...     retval = []
...     for _ in range(length):
...         retval.append(random.choice(acceptable))
...     return ''.join(retval)
...
>>> for _ in range(10):
...     append_until_length(acceptable)
...
"] *rZI/<=LwPGU-PzWj)\\jp9tZ}e9T#}4/\\R`4Q^?4)'W"
'%z6wTvuzK;{eS}"^GRf(}a3<"Qqg_*2v?1`y@;=Bn#ycQ'
"t'bqj,*}7:w]:8c;Ddy. 17@^Y0{)>}'25tsl1kf+C%6^"
'RZt)s=?~QrAok+Z\\ei}5K^&1e+w0~*zl{hS2;l]|?p/T;'
'%InO5_fWcJU#v,6_=cPb^cfd1=\\;k{37~$214vd+F&oH&'
'!6Ey#"\'3.,ivG+7\'y[&1`aYNDg-\\j#:! -7(8b#$x)Q1m'
'w}/{mnT\\-IT2?;V_K ZDDy:YzaG+LgGkZWkV8E y@_)Y;'
'e1@71AFDF;|Q.<_fRG0tG*`557z(|}bHDCT+dc}{[QGq8'
"ie~;Iy1O)f!n,Z%%0\\36-!Lke1}cA'uptRS7(2ki|mzgi"
'G=v&#.J1@E$N?NK|~>( E4M/^y[~HK)#Hi$23ez~EY>N '
Even if I treat the list like a set, there is still no pattern in the output strings:
def append_until_length(acceptable, length=45):
    retval = []
    for _ in range(10000):
        char = random.choice(acceptable)
        if char not in retval:
            retval.append(char)
        if len(retval) == length:
            return ''.join(retval)
8hKO W5"'ERJa/N$vb9^4!)fig:c_n&?@(#}oTC]qePwZ
,b2;Y^VD9|:O!>QilH`4(7/F?8f&5~_B$x#pN{Igahs\n
_z1eDiH$9k&rRt>M/FOqb8SLY.{|0dI4A^:l,3cs7ng][
Y/iu#eOlVMmZ 9S`t?1JX2$<)&|jUz'"~wLIvoqkr}!(H
r~/m{8SLvU?_aVX4A"0%zEgK1I!9#B|snphOZb,@jw\]2
;nX!T20.^b"\eqNExOlrQF'V&#(%iht{Hw+-Sy,Dj]:9[
B@%H[2f&JuwSd1bEnih#}]3jTMLzAW.ZG~,tX|!/N_`D(
usv}KkZgL]&<hY^6Blp\GENTrFC~Xw3#4S8QmRf"PUnM|
?G3Ao[z7gVLve-}S>X]&<+k(DZ*UcsM50r)^1Om`P4K,6
,#&(1-'sj9qy7~dZpuIk!%Q D8haSNrco{xe;=.T[WK0<
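As an aside, the same "no repeats, no pattern" result can probably be had without the retry loop. This is only a sketch using random.sample, which picks distinct elements from a sequence and returns them in selection order. The name sample_unique is made up for illustration, and it assumes acceptable has no duplicates and at least length elements.

```python
import random
import string

acceptable = list(string.printable)[:-6]
acceptable.append(" ")

def sample_unique(acceptable, length=45):
    # random.sample never picks the same position twice, so as long as
    # `acceptable` itself has no duplicates, neither does the result.
    # It raises ValueError if length > len(acceptable).
    return ''.join(random.sample(acceptable, length))

result = sample_unique(acceptable)
print(result)
```

Because the result never passes through a set, the characters stay in the order they were drawn.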
So my question is: why does the pattern occur with a set? The uppercase and lowercase characters have different ord values, so they are different characters. For example:
>>> ord("c")
99
>>> ord("C")
67
>>>
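To compare the pattern against the ord values, here is a small sketch that sorts the acceptable characters by ord and collapses them into runs of the same kind. The helpers kind and group_runs are names made up for this illustration. The runs come out in the same order as the pattern described earlier, which hints that the set ordering is related to ord, though it does not by itself explain why.

```python
import string

acceptable = list(string.printable)[:-6]
acceptable.append(" ")

def kind(ch):
    # Classify a single ASCII character; anything that is not a digit
    # or a letter (including the space) is counted as punctuation here.
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    return "punct"

def group_runs(chars):
    # Sort by code point, then collapse consecutive characters of the
    # same kind into a single entry.
    runs = []
    for ch in sorted(chars, key=ord):
        k = kind(ch)
        if not runs or runs[-1] != k:
            runs.append(k)
    return runs

print(group_runs(acceptable))
# ['punct', 'digit', 'punct', 'upper', 'punct', 'lower', 'punct']
```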
So in my head, it doesn't make sense that there is a pattern in the strings if they are randomly generated. According to help(set):
class set(object)
| set() -> new empty set object
| set(iterable) -> new set object
|
| Build an unordered collection of unique elements.