9

I use regular expressions to validate user input. The regexes are now configurable, so it would help the user to see an example of how a given input has to be formatted.

Is it possible to generate some strings that match an arbitrary regex? And is there a usable implementation somewhere?

UPDATE: Due to its licence, I cannot use Rex. Are there other possibilities?

schoetbi
  • 1
    I suggest looking into Rex (http://research.microsoft.com/en-us/projects/rex/), which does exactly this. Let me know if it works for you. – Ron.B.I Jul 08 '13 at 09:07
  • 1
    Check out this website: http://debuggex.com. Enter any regex you want, then have a look at the `Some random matches` section. You may be surprised by the suggestions it makes once your regex gets complicated. – Stephan Jul 08 '13 at 09:16
  • What could be generated by `.*`? – Toto Jul 08 '13 at 09:18
  • Depending on the options everything but line breaks or everything. – Joey Jul 08 '13 at 09:19

4 Answers

8

Try Rex; it can do this :)

http://research.microsoft.com/en-us/projects/rex/

For Java, there is Xeger: https://code.google.com/p/xeger/

So there are several generators of regex matches :)

And for C#, there is this: https://github.com/moodmosaic/Fare

It's a C# port of Xeger.

Kamil Budziewski
2

Some solutions:

(1) If the regex is written by you (not by the user) and rarely changes, why create anything programmatically? You could just create a few nice examples by hand.

(2) Use a ready-made solution. (see other answers)

(3) Rejection sampling, the sledgehammer solution to all random generation problems: Create a random string and check whether it matches the regex. If not, try again. If the regex is very specific, almost every candidate is rejected, so this solution has terrible performance.
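To make option (3) concrete, here is a minimal sketch in Python (not the .NET code from the question). The function name `sample_match`, the default alphabet, and the length and retry limits are assumptions chosen for illustration; only `re.fullmatch` and the `random` module come from the standard library.

```python
import random
import re
import string


def sample_match(pattern, alphabet=string.ascii_uppercase + string.digits,
                 max_len=6, max_tries=100_000, rng=None):
    """Return a random string that fully matches `pattern`, or None.

    Hypothetical helper: draws random strings over `alphabet` and keeps
    the first one the regex accepts (rejection sampling).
    """
    rng = rng or random.Random()
    compiled = re.compile(pattern)
    for _ in range(max_tries):
        # Pick a random length, then random characters from the alphabet.
        length = rng.randint(0, max_len)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # fullmatch requires the whole candidate to match, not just a prefix.
        if compiled.fullmatch(candidate):
            return candidate
    # Give up after max_tries rejected candidates.
    return None


if __name__ == "__main__":
    print(sample_match(r"[A-Z]{2}\d{2}"))
```

The retry limit matters: a pattern that cannot be matched with the chosen alphabet or maximum length would otherwise loop forever, and a very specific pattern can exhaust the limit and yield `None`.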

(4) Implement a parser that transforms a regex into a string construction tree that consists e.g. of the nodes below. Every node has a CreateRandomString method that follows certain rules. Creating a random string means calling that method for the root node.

concatenation: Traverse all child subtrees and concatenate the results in order.

random choice: Select a random child subtree and traverse it. Return the result.

multiplication: Create a random number n between a and b. Traverse the subtree n times and concatenate the results.

leaf: Return a constant string.

Creating the parser is the tricky part :), especially for nested structures. (I have written one for a syntax similar to regexes.)
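The four node types from (4) could look roughly like the sketch below. It is written in Python for brevity and skips the parser entirely: the tree for the example pattern `[A-Z]{2}-\d{3,5}` is built by hand. All class names, the `char_class` helper, and the example pattern are assumptions made for illustration, not part of any existing library.

```python
import random
import re


class Leaf:
    """leaf: returns a constant string."""
    def __init__(self, text):
        self.text = text

    def create_random_string(self, rng):
        return self.text


class Concat:
    """concatenation: joins the results of all children in order."""
    def __init__(self, *children):
        self.children = children

    def create_random_string(self, rng):
        return "".join(c.create_random_string(rng) for c in self.children)


class Choice:
    """random choice: picks one child at random and returns its result."""
    def __init__(self, *children):
        self.children = children

    def create_random_string(self, rng):
        return rng.choice(self.children).create_random_string(rng)


class Repeat:
    """multiplication: repeats the child n times, with low <= n <= high."""
    def __init__(self, child, low, high):
        self.child, self.low, self.high = child, low, high

    def create_random_string(self, rng):
        n = rng.randint(self.low, self.high)
        return "".join(self.child.create_random_string(rng) for _ in range(n))


def char_class(chars):
    """Hypothetical helper: a character class is a choice between leaves."""
    return Choice(*(Leaf(c) for c in chars))


# Hand-built tree for the example pattern below (a real parser would build it).
PATTERN = r"[A-Z]{2}-\d{3,5}"
tree = Concat(
    Repeat(char_class("ABCDEFGHIJKLMNOPQRSTUVWXYZ"), 2, 2),
    Leaf("-"),
    Repeat(char_class("0123456789"), 3, 5),
)

example = tree.create_random_string(random.Random())
# Sanity check: the generated string should match the original regex.
print(example, bool(re.fullmatch(PATTERN, example)))
```

Unbounded quantifiers such as `*` and `+` have no natural upper limit, so a generator like this would have to pick an arbitrary cap for `high`.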

Sebastian Negraszus
  • I thought about number 4, since the regex parser of the .NET Framework is pure managed code. But then I found Fare, which works quite nicely. The regexes change, there are several of them, and they are also configurable by the user (not the end user, though) ;) – schoetbi Jul 08 '13 at 18:48
1

As mentioned in the comments, the Rex tool will do the trick.

Using Rex to create strings that match your pattern:

Run rex.exe as follows:

rex.exe "your_regex_pattern_here" /k:your_required_examples_num_here

More info on this: Rex Guide

Ron.B.I
0

Almost certainly not, no.

Regular expressions are generally used, in the context you're looking at, to check that a string matches a given format. If you know what your format should be well enough that you're writing a regular expression for it, there should be no reason why you can't generate your own test data easily enough.

[Edit - it appears there are a few examples around. But this does ignore the fact that, to test that your regex is correct, you must have written test data already. So you should already have your strings.]

Adrian Wragg
  • I think it should be possible to have a class like the regex parser in .NET that, instead of checking the rules, picks one of the valid characters for each token in the regex and appends it to the example string. – schoetbi Jul 08 '13 at 09:08
  • In the situation you describe, you're providing expressions to validate the data against. So surely you know the data already, to have written the expression in the first place? – Adrian Wragg Jul 08 '13 at 09:19
  • No, the user has not yet entered the string to match. I would like to present him with valid examples. – schoetbi Jul 08 '13 at 09:35
  • But how did you write the regular expression, if you didn't already have valid examples in mind? Why not present him with those? It's your own app, after all. – Adrian Wragg Jul 08 '13 at 09:43
  • I do have them in mind, since they are e.g. part numbers that have to follow a certain pattern. The user can pick his own number for his part, but the pattern has to be obeyed. – schoetbi Jul 08 '13 at 10:56
  • I'm still not understanding why you are making things more difficult for yourself, when you yourself have the knowledge of the pattern and access to completely valid sample values. You also run the risk of, for example, profanities appearing in your examples. – Adrian Wragg Jul 08 '13 at 11:24