
I want to generate an example of a valid input from a regex pattern. I'm programming in C# (.NET). Something like this:

// This method doesn't exist; it's an example of the functionality that I want.
Regex.GenerateInputExample("^[0-9]{15}$"); 

So, this example gives me a possible value, like 000000000000000. How can I do this?

Sergey Berezovskiy
Only a Curious Mind
  • There is no built-in functionality for this. You should analyze the regex pattern and build sample input manually. It might be easy with simple patterns like the one you have shown, `^[0-9]{15}$`, but something like `^(?:(?=.*[a-z])(?:(?=.*[A-Z])(?=.*[\d\W])|(?=.*\W)(?=.*\d))|(?=.*\W)(?=.*[A-Z])(?=.*\d)).{8,}$` will require a lot of work. I think building such an analyzer and generator makes this question too broad. – Sergey Berezovskiy Jun 11 '14 at 20:59
  • There are 10^15 possibilities for the above Regex. How would you decide which one to choose? –  Jun 11 '14 at 21:01
  • According to computation theory, you can write a regular grammar for any regular expression, and regular grammars are producers (while regex are recognizers). Not sure how it would be used tho. – Mephy Jun 11 '14 at 21:04
  • Expanding on @Mephy regex can be converted to push-down automaton etc... which I assume can then be converted into a regular grammar? – Millie Smith Jun 11 '14 at 21:05
  • @KunalB. I can't remember the theory, but I'm guessing non-determinism and randomization in the producer? – Millie Smith Jun 11 '14 at 21:07
  • possibly this is a duplicate of http://stackoverflow.com/questions/3131229/is-it-possible-to-generate-an-example-string-based-on-a-regex-pattern?rq=1 and http://stackoverflow.com/questions/205411/random-string-that-matches-a-regexp – Xantix Jun 11 '14 at 21:42
  • @KunalB.: first one, randomly, the one with the lowest hashcode... It doesn't matter how you decide which to choose. If you choose one of the valid possibilities then you have satisfied the requirement to find an example of a valid input. – Chris Jun 11 '14 at 21:46
  • Have you taken a look at [Xeger](https://code.google.com/p/xeger/)? – alex.b Jun 11 '14 at 22:21

1 Answer


So, this problem would take some time to solve, since the functionality is not built in. I'll give a general way to solve it:

Using an ASCII (or Unicode) chart, find the character codes that correspond to the characters used in your regex (65 = A, 69 = E, etc.).

Create a random function with those bounds. Multiple ranges take a little more trickery (A-Z = 26 characters, 0-9 = 10, so pick a random number from 0 to 35 and map it onto the two ranges).


Random random = new Random();
int randomNumber = random.Next(65, 70); // the upper bound is exclusive, so this yields a number from 65 to 69

char temp = (char)randomNumber; // convert the character code to its character
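
The "multiple bounds" case mentioned above (A-Z plus 0-9, 36 choices in total) could be handled roughly as follows. This is only a sketch; `MultiRangePicker`, `FromIndex`, and `Next` are names made up for this illustration, not part of .NET.

```csharp
using System;

// Hypothetical helper for illustration only.
static class MultiRangePicker
{
    // Maps an index in 0..35 onto 'A'..'Z' (indexes 0-25) or '0'..'9' (indexes 26-35).
    public static char FromIndex(int index)
    {
        if (index < 0 || index > 35)
            throw new ArgumentOutOfRangeException(nameof(index));

        return index < 26
            ? (char)('A' + index)
            : (char)('0' + (index - 26));
    }

    // Picks one random character from the combined set.
    public static char Next(Random random)
    {
        // Random.Next(0, 36) returns 0..35 because the upper bound is exclusive.
        return FromIndex(random.Next(0, 36));
    }
}
```

Calling `MultiRangePicker.Next(new Random())` then returns one character that `[A-Z0-9]` would accept.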

Next you would take the randomly generated characters and add them together into a string.

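        int lowerBound = 65, upperBound = 69; // character codes for 'A' and 'E'
        int length = 6;                       // length of the string to be generated
        char temp;
        int randomNumber;
        string result = "";

        Random rand = new Random();
        for (int a = 0; a < length; a++)      // "<" rather than "<=" so exactly 6 characters are produced
        {
            // Random.Next excludes its upper bound, so add 1 to include upperBound
            randomNumber = rand.Next(lowerBound, upperBound + 1);
            temp = (char)randomNumber;
            result = result + temp;
        }                  // result is the indirectly regex-generated string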

This indirectly gives you a string that the regex would match.

The next step is parsing information out of a regex. I've provided a simple case below that will not work for every regex, due to regex complexity.

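        // requires: using System; using System.Text.RegularExpressions;
        Regex bob = new Regex("[A-Z]");

        // For a pattern of the exact form "[X-Y]", index 1 is the lower bound and index 3 the upper bound
        int lowerBound = Convert.ToInt32(bob.ToString()[1]);
        int upperBound = Convert.ToInt32(bob.ToString()[3]);
        int length = 6; // length of the string to be generated
        char temp;
        int randomNumber;
        string result = "";

        Random rand = new Random();
        for (int a = 0; a < length; a++)      // "<" rather than "<=" so exactly 6 characters are produced
        {
            // Random.Next excludes its upper bound, so add 1 to include upperBound
            randomNumber = rand.Next(lowerBound, upperBound + 1);
            temp = (char)randomNumber;
            result = result + temp;
        }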

(This process could be streamlined into a class and reused.)
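
As a rough illustration of what such a class might look like, here is a minimal sketch. It assumes patterns of the narrow form `^[...]{n}$`, where the brackets contain only plain ranges (such as `0-9`) and single characters; anything else is rejected. `SimpleRegexSampler` and `Generate` are hypothetical names, not an existing .NET API.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only. Supports just: optional ^,
// one bracketed character class, optional {n}, optional $.
static class SimpleRegexSampler
{
    // Recognizes the supported shape and captures the class body and the repeat count.
    private static readonly Regex Shape =
        new Regex(@"^\^?\[([^\]\\^]+)\](?:\{([0-9]+)\})?\$?$");

    public static string Generate(string pattern, Random random)
    {
        Match shape = Shape.Match(pattern);
        if (!shape.Success)
            throw new NotSupportedException("Pattern is outside the supported subset: " + pattern);

        // Expand the class body (for example "A-Z0-9_") into the characters it allows.
        string body = shape.Groups[1].Value;
        var allowed = new List<char>();
        for (int i = 0; i < body.Length; i++)
        {
            if (i + 2 < body.Length && body[i + 1] == '-')
            {
                // A range such as A-Z: add every character from start to end.
                for (int c = body[i]; c <= body[i + 2]; c++)
                    allowed.Add((char)c);
                i += 2;
            }
            else
            {
                allowed.Add(body[i]); // a single literal character
            }
        }
        if (allowed.Count == 0)
            throw new NotSupportedException("Character class allows no characters: " + pattern);

        // Without a {n} quantifier the class matches exactly one character.
        int count = shape.Groups[2].Success ? int.Parse(shape.Groups[2].Value) : 1;

        var result = new StringBuilder(count);
        for (int i = 0; i < count; i++)
            result.Append(allowed[random.Next(allowed.Count)]);
        return result.ToString();
    }
}
```

For example, `SimpleRegexSampler.Generate("^[0-9]{15}$", new Random())` produces a 15-digit string, and the result can be checked with `Regex.IsMatch` against the original pattern. Alternation, groups, lookaheads, and escapes are deliberately out of scope here; supporting them requires a real regex parser.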

  • Thanks for the answer, hobble! But the regex in the question is only an example; in another regex I might use only letters, or only special characters, etc. I need something that works for every regex. – Only a Curious Mind Jun 12 '14 at 11:21