
Possible Duplicate:
how to check if a string looks randomized, or human generated and pronouncable?
Is there any way to detect strings like putjbtghguhjjjanika?

Is there an algorithm that can measure how random an online nickname is? There are many situations where it would be useful.

Given any alphanumeric name, the algorithm should be able to give it a "randomness" value. If the randomness value is too high, the application could then force the user to choose another name.
For example, "Mikel" would pass the test and be allowed, while "Agslj" would fail, and the user would be forced to choose another name.

If there isn't an algorithm already available, how would I be able to create an algorithm for this?

Community
小太郎
  • I guess [1337](http://en.wikipedia.org/wiki/Leet)-speak will make this very hard. – Filburt Apr 16 '12 at 12:20
  • I wouldn't bother. This can only upset the users. Would stick to the CAPTCHA and usual stuff... – Anonymous Apr 16 '12 at 12:23
  • One of the rules for the application I'm designing this for is that names must not use leetspeak and must be English, human-readable, and generally literate. Abbreviations are also not allowed. – 小太郎 Apr 16 '12 at 12:26
  • What do you mean by 'names must be English'? Names aren't words in a language. "Mikel" isn't a typical English name. If someone's name is "Agslj", why shouldn't they be allowed to use that as their nickname? – AakashM Apr 16 '12 at 12:32
  • "Names must be English" as in the name must be English-like, readable and easily pronounceable by English speakers. "Mikel" might not be an English name, but it could be considered English, while "Agslj" doesn't look like it could be English at all. – 小太郎 Apr 16 '12 at 12:37
  • @OP This is still going to annoy your users. However, if you want to know if this is pronounceable, that is a completely different issue from "randomness". – Marcin Apr 16 '12 at 12:48
  • "doesn't look like it could be English at all" - again, *names* are not *words in a language*. If you can't explain in unambiguous terms what you want to do, you're not going to have much luck getting a computer to do it for you. And that's before we even get to 'abbreviations are not allowed'... no "Lisa"? no "Becky"? no "Alex"? – AakashM Apr 16 '12 at 12:58
  • Uh sorry, I worded that wrong. I meant acronyms. – 小太郎 Apr 16 '12 at 13:26
  • I wrote the solution to the "detecting strings like putjbtghguhjjjanika" post, and I think it will work reasonably well for this problem. – Rob Neuhaus Apr 16 '12 at 22:06

2 Answers


You would probably want to look at the usage patterns and frequencies of English letters; Letter Frequency Analysis is a good website for the very basics.
This page will also help you in your hunt.

Basically what you want to do is use similar techniques to what cryptographers use when trying to crack codes.
If the word entered into the text box matches the known usage pattern within a loose threshold, you can allow it; if the entry does not match any frequency or usage pattern even a little bit, you can safely discard it.

I would definitely recommend looking into such techniques before attempting any algorithm of this sort.

However, on such short input I cannot guarantee accuracy...
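
As a rough illustration of the idea, here is a minimal sketch that scores a name by the average log-frequency of its letters. The frequency table holds commonly cited approximate values for English text, and the function names and the threshold of -3.5 are hypothetical choices made for this example, not tuned values.

```python
import math

# Approximate relative letter frequencies in English text (percent).
# Commonly cited ballpark figures; treat them as an assumption.
ENGLISH_FREQ = {
    "e": 12.7, "t": 9.1, "a": 8.2, "o": 7.5, "i": 7.0, "n": 6.7,
    "s": 6.3, "h": 6.1, "r": 6.0, "d": 4.3, "l": 4.0, "c": 2.8,
    "u": 2.8, "m": 2.4, "w": 2.4, "f": 2.2, "g": 2.0, "y": 2.0,
    "p": 1.9, "b": 1.5, "v": 0.98, "k": 0.77, "j": 0.15, "x": 0.15,
    "q": 0.095, "z": 0.074,
}


def mean_log_freq(name):
    """Average natural log of the English frequency of each letter.

    Characters that are not a-z are ignored. Higher (less negative)
    means the letters are more typical of English.
    """
    letters = [ch for ch in name.lower() if ch in ENGLISH_FREQ]
    if not letters:
        return float("-inf")  # nothing to score
    total = sum(math.log(ENGLISH_FREQ[ch] / 100.0) for ch in letters)
    return total / len(letters)


def looks_plausible(name, threshold=-3.5):
    """Hypothetical acceptance test; the threshold is an untuned guess."""
    return mean_log_freq(name) >= threshold


print(mean_log_freq("Mikel"))
print(mean_log_freq("Agslj"))
```

With this table "Mikel" scores higher than "Agslj", but single-letter frequencies ignore letter order entirely, so a score like this is weak evidence on names only a few characters long.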

Serdalis
  • I wouldn't guarantee anything on such short input. A word like "crypt" might throw it way off, whereas a word like 'queue' will throw it off in the opposite direction. – Paragon Apr 16 '12 at 15:34
  • Indeed, the usual use of this technique is on big lines of text, but I thought this would point him in an at least plausible direction =d. – Serdalis Apr 16 '12 at 23:24

It seems hard to implement an algorithm like this, but you could try something:

If you check English words (or words in any Latin-derived language), there aren't many with more than 3 consecutive consonants. The same goes for vowels.

Also, you can count how many numbers are present, at which place, etc.
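
A minimal sketch of these checks, assuming "y" counts as a consonant and that any non-letter character breaks a run. The function names and the default limit of 3 are illustrative choices, not an established rule.

```python
VOWELS = set("aeiou")  # assumption: 'y' is treated as a consonant


def longest_runs(name):
    """Return (longest consonant run, longest vowel run) in name."""
    best_c = best_v = cur_c = cur_v = 0
    for ch in name.lower():
        if ch in VOWELS:
            cur_v += 1
            cur_c = 0
        elif "a" <= ch <= "z":
            cur_c += 1
            cur_v = 0
        else:
            # Digits and other characters break both runs.
            cur_c = cur_v = 0
        best_c = max(best_c, cur_c)
        best_v = max(best_v, cur_v)
    return best_c, best_v


def passes_run_check(name, max_run=3):
    """Reject names with more than max_run consonants or vowels in a row."""
    consonants, vowels = longest_runs(name)
    return consonants <= max_run and vowels <= max_run


def digit_positions(name):
    """Indices of the digits in name, for the 'how many, and where' check."""
    return [i for i, ch in enumerate(name) if ch.isdigit()]


print(passes_run_check("Mikel"))   # True
print(passes_run_check("Agslj"))   # False: "gslj" is 4 consonants in a row
print(digit_positions("ab12c"))    # [2, 3]
```

Note that a real word such as "queue" has four vowels in a row and would be rejected with a limit of 3, so the limit needs some slack or a whitelist.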

noli