8

Given a regexp, I would like to generate random matching data any number of times, to use as test input.

e.g.

>>> print(generate_data(r'\d{2,3}'))
13
>>> print(generate_data(r'\d{2,3}'))
422

Of course, the real objective is something a bit more complicated than that, such as generating phone numbers and email addresses.

Does something like this exist? If it does, does it exist for Python? If not, is there any clue or theory I could use to build it?

Bite code

3 Answers

8

Pyparsing includes this regex inverter, which returns a generator of all the strings that match a simple regex. Here are some of the test cases from that module:

[A-C]{2}\d{2}
@|TH[12]
@(@|TH[12])?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9]))?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9])|OH(1[0-9]?|2[0-9]?|30?|[4-9]))?
(([ECMP]|HA|AK)[SD]|HS)T
[A-CV]{2}
A[cglmrstu]|B[aehikr]?|C[adeflmorsu]?|D[bsy]|E[rsu]|F[emr]?|G[ade]|H[efgos]?|I[nr]?|Kr?|L[airu]|M[dgnot]|N[abdeiop]?|Os?|P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|Uu[bhopqst]|U|V|W|Xe|Yb?|Z[nr]
(a|b)|(x|y)

Edit:

To do your random selection, create a list (once!) of your permutations, and then call random.choice on the list each time you want a random string that matches the regex, something like this (untested):

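import random
import invRegex  # the pyparsing regex inverter; assumed to be importable

class RandomString(object):
    def __init__(self, regex):
        # Enumerate every matching string once, up front.
        self.possible_strings = list(invRegex.invert(regex))
    def random_string(self):
        # Pick uniformly from the precomputed list.
        return random.choice(self.possible_strings)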
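
For anyone who wants to try the build-once, pick-many idea without installing anything, here is a minimal self-contained sketch. Note that `invert_simple` is a hypothetical stand-in written for this example, not part of pyparsing: it only enumerates strings from (alphabet, repeat count) pieces rather than parsing a real regex.

```python
import itertools
import random
import re
import string


def invert_simple(pieces):
    """Hypothetical stand-in for a regex inverter.

    Yields every string that can be built from a sequence of
    (alphabet, repeat_count) pieces, in order.
    """
    pools = []
    for alphabet, count in pieces:
        pools.extend([alphabet] * count)
    for combo in itertools.product(*pools):
        yield "".join(combo)


class RandomString:
    def __init__(self, all_strings):
        # Materialize the generator once; every later pick is then cheap.
        self.possible_strings = list(all_strings)

    def random_string(self):
        # Uniform choice over the precomputed strings.
        return random.choice(self.possible_strings)


# Pieces equivalent to [A-C]{2}\d{2}: 3 * 3 * 10 * 10 = 900 strings.
picker = RandomString(invert_simple([("ABC", 2), (string.digits, 2)]))
samples = [picker.random_string() for _ in range(5)]
print(samples)
```

The list is built only once, so this works well when the set of matching strings is small. For patterns with wide repetition ranges the list can grow very quickly, which is the main cost of this approach.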
PaulMcG
  • Almost what I'm looking for. +1 – Bite code Aug 15 '10 at 14:44
  • I've also packaged this module up as a utility on UtilityMill: http://utilitymill.com/utility/Regex_inverter. All UM utilities expose XML and JSON API's, so you can call this remotely from your own code, and UtilityMill does the regex inversion processing. – PaulMcG Aug 26 '10 at 12:48
  • @PaulMcG: the site requires a username and password upfront. – Dan Dascalescu Aug 09 '17 at 03:27
  • Sorry about that, the site owner was getting spammed and had to throttle down the access. – PaulMcG Aug 09 '17 at 04:12
2

There is a post on the Python mailing list about a module that generates all permutations of a regex. I'm not so sure how you might go about randomising it though. I'll keep checking.

detly
2

I will probably be flogged for suggesting this, but Perl has a module that does exactly this. You might want to take a look at its code to see how to implement it in Python:

http://p3rl.org/String::Random

nicomen