1

Is there a library that allows me to check for the randomness of an input string? Something like:

 >>> is_random_str("dfgjfgnsdfj9p5230948hfirif")
 True
 >>> is_random_str("Hello theree")
 False
user2399453
  • 1
    It really depends on what random distribution you're drawing from. However, if you want to only check for a uniform distribution, you could simply count the character frequencies and check that the variance is minimal – inspectorG4dget Oct 29 '15 at 21:33
  • 1
    I picked that dupe target because it discusses entropy, which is a more appropriate concept for what you're trying to measure, and how it is measured from the input that generated the string and not the generated output. Your example assumes there is some kind of binary True/False measurement which is not possible and implies a sort of [Sorites paradox](https://en.wikipedia.org/wiki/Sorites_paradox). – Two-Bit Alchemist Oct 29 '15 at 21:34

1 Answer

6

There is a Python port of zxcvbn.

It calculates entropy (guessability), not "randomness". To check for "randomness", you would first have to establish "dictionary-ness" by looking the string up in a dictionary (of English words, say). Fewer matches means the string is more random. At its core, zxcvbn has logic to check against a dictionary of 10,000 common words.
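As a rough illustration of the "dictionary-ness" idea (not zxcvbn's actual implementation), the sketch below measures what fraction of a string's letters are covered by dictionary words. `SAMPLE_WORDS` and `dictionary_coverage` are hypothetical names, and the tiny word set is only a stand-in for a real word list.

```python
import re

# Hypothetical stand-in for a real word list; swap in a proper dictionary.
SAMPLE_WORDS = {"hello", "there", "this", "test", "word", "the", "her", "here"}


def dictionary_coverage(text, words=SAMPLE_WORDS, min_len=3):
    """Return the fraction of letters in `text` that fall inside a dictionary word."""
    # Keep only the letters a-z so digits, spaces and punctuation are ignored.
    letters = re.sub(r"[^a-z]", "", text.lower())
    if not letters:
        return 0.0
    covered = [False] * len(letters)
    # Brute-force every substring of at least `min_len` letters.
    for start in range(len(letters)):
        for end in range(start + min_len, len(letters) + 1):
            if letters[start:end] in words:
                for i in range(start, end):
                    covered[i] = True
    return sum(covered) / len(letters)


print(dictionary_coverage("Hello theree"))
print(dictionary_coverage("dfgjfgnsdfj9p5230948hfirif"))
```

Low coverage suggests a more "random" string. Where to put the cutoff is a judgment call, and the brute-force substring scan is only suitable for short inputs.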

There is another strategy that may be simpler and closer to what you're after: check for "pronounceability" by looking for patterns like {{consonant}} {{vowel}} | {{double vowel: au, oo, ou}} {{consonant}} | {{double consonant: tt, pp, th}}. Then, for every violation of your defined pattern, increase the randomness score.
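One possible reading of the pronounceability idea, sketched under some assumptions: the listed double vowels and double consonants are treated as single units, two adjacent units of the same kind count as one violation, and every non-letter character counts as one violation. The name `randomness_score` and these scoring rules are illustrative choices, not an established algorithm.

```python
import re

VOWEL_UNITS = ("au", "oo", "ou")      # double vowels listed above
CONSONANT_UNITS = ("tt", "pp", "th")  # double consonants listed above

# Try the two-letter units first, then single letters, then anything else.
_UNIT_RE = re.compile("|".join(VOWEL_UNITS + CONSONANT_UNITS) + r"|[a-z]|.")


def _kind(unit):
    """Classify a unit as vowel (V), consonant (C) or other (X)."""
    if not unit[0].isalpha():
        return "X"
    return "V" if unit[0] in "aeiou" else "C"


def randomness_score(text):
    """Count violations of a simple alternating consonant/vowel pattern."""
    violations = 0
    for word in text.lower().split():
        kinds = [_kind(unit) for unit in _UNIT_RE.findall(word)]
        # Assumption: every digit or symbol is a violation on its own.
        violations += kinds.count("X")
        # Assumption: two consonant units or two vowel units in a row is a violation.
        for prev, cur in zip(kinds, kinds[1:]):
            if prev == cur and prev != "X":
                violations += 1
    return violations


print(randomness_score("Hello theree"))
print(randomness_score("dfgjfgnsdfj9p5230948hfirif"))
```

The raw count grows with input length, so dividing by the number of units may give a fairer comparison between strings of different sizes.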

FlavorScape
  • I don't see why you'd want to bring entropy into this at all. "Word" is just as non-random (based on "is it valid English") as "This is a test", but has significantly lower entropy. The fact is, you can't determine how random a sentence is based on its entropy, as entropy puts a value on whether something is guessable based on predefined masks. – Rejected Oct 29 '15 at 21:51