Is it possible to generate regular expressions from a user-entered string? Are there any C# libraries to do this?

For example, a user enters a string such as ABCxyz123 and the C# code automatically generates [A-Z]{3}[a-z]{3}\d{3}. That is a simple string, but we could have more complicated strings like

MON-0123/AB/5678-abc 2/7

Or

1234-678/abc::1234ABC?246

I already have a string tokeniser (from a previous Stack Overflow question), so I could construct a regex from the list of tokens.

But I was wondering if there is a lib or C# code out there that’ll do it.

Edit: Important, I should have also said: it's not the actual characters in the string that are important, but the type of character and how many.

E.g. a user could enter a "pattern" string of ABCxyz123.

This would be interpreted as

3 upper case alphas followed by

3 lower case alphas followed by

3 digits

So other users must enter strings that match the compiled pattern [A-Z]{3}[a-z]{3}\d{3}, e.g. QAZplm789.

It's the format of user-entered strings that needs to be checked, not the actual content, if that makes sense.
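To make the "format, not content" point concrete, here is a minimal sketch of the check for the example pattern. The names `FormatCheck` and `HasShape` are made up for illustration. The pattern is anchored with `\A` and `\z` so the whole input must have the shape, not just a substring of it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FormatCheck
{
    // \A and \z anchor the match to the entire input string.
    private static readonly Regex Shape =
        new Regex(@"\A[A-Z]{3}[a-z]{3}\d{3}\z");

    // True when the input is 3 upper-case letters, 3 lower-case letters, 3 digits.
    public static bool HasShape(string input) => Shape.IsMatch(input);
}
```

With this, both `QAZplm789` and `ABCxyz123` pass, while `QAZplm78` (one digit short) does not.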

Rory
  • How do you distinguish if `ABCxyc` should resolve to `A-Za-z` or `A-Cx-z`? – MakePeaceGreatAgain Nov 14 '14 at 10:34
  • Something like this would be stupid to do since it won't work, because [A-Z]{3}[a-z]{3}\d{3} could be a vastly different string than the one you entered. You would need to create many rules for this, and I see no point in doing this since you can just use regex itself – Vajura Nov 14 '14 at 10:43
  • Think you've misread my question – Rory Nov 14 '14 at 10:48
  • I agree with other posters, this seems like a really strange solution, which got me curious... Could you elaborate a little more on what problem you're trying to solve? – Daniel Persson Nov 14 '14 at 10:51
  • Upvoted because this is an interesting question which provides use case examples. This may be better suited for programmers.SE though, it's a broad design question rather than a specific implementation. – Freiheit Nov 14 '14 at 10:52
  • "So other users (when complied) must enter strings that match that pattern [A-Z]{3}[a-z]{3}\d{3}., e.g. QAZplm789 It's the format of user entered strings that's need to be checked not the actual content if that makes sense" - This seems key. Are you trying to allow one user (say an admin) to define an input mask for fields which will be entered by other users? – Freiheit Nov 14 '14 at 10:54
  • [Related](http://stackoverflow.com/q/16499142/1578604) – Jerry Nov 14 '14 at 10:55
  • Yes, a template is set up by an admin. There could be any number of fields in one document and a different set of fields in another document. – Rory Nov 14 '14 at 10:57

1 Answer

Jerry posted a related link in the comments: creating a regular expression for a list of strings.

There are a few other links off this.

I'm not trying to do anything complicated e.g NLP etc.

I could use a C# expression builder and Dynamic LINQ at a push, but that seems overkill and a code maintenance nightmare.

I'll write my own "simple" regex builder from the tokenized string.
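A minimal sketch of what such a "simple" builder could look like, assuming the rules from the question: upper-case letters become `[A-Z]`, lower-case letters become `[a-z]`, digits become `\d`, any other character is kept as an escaped literal, and runs of the same type collapse into a `{n}` count. The names `PatternRegexBuilder`, `ClassOf` and `Build` are made up for illustration, and this version classifies characters directly rather than going through the existing tokeniser.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternRegexBuilder
{
    // Maps one character to the regex fragment for its "type".
    // Assumption: only ASCII letters and digits are generalised.
    private static string ClassOf(char c)
    {
        if (c >= 'A' && c <= 'Z') return "[A-Z]";
        if (c >= 'a' && c <= 'z') return "[a-z]";
        // Note: in .NET, \d also matches non-ASCII digits; use [0-9] if that matters.
        if (c >= '0' && c <= '9') return @"\d";
        // Separators such as '-', '/', ':' or '?' are matched literally.
        return Regex.Escape(c.ToString());
    }

    // Builds an unanchored regex from a sample string by run-length
    // encoding consecutive characters of the same type.
    public static string Build(string sample)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < sample.Length)
        {
            string cls = ClassOf(sample[i]);
            int run = 1;
            while (i + run < sample.Length && ClassOf(sample[i + run]) == cls)
                run++;

            sb.Append(cls);
            if (run > 1) sb.Append('{').Append(run).Append('}');
            i += run;
        }
        return sb.ToString();
    }
}
```

For the sample `ABCxyz123` this yields `[A-Z]{3}[a-z]{3}\d{3}`. The result is not anchored, so wrapping it as `"^" + pattern + "$"` (or `\A` and `\z`) before validating input is probably what you want.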

Example Use Case:

An admin office user where I work could set up the string patterns for each field by typing a pattern string. My code converts this to a regex, and I store these in a database.

E.g.: Field one requires 3 digits at the start. If there are 2 digits, send to workflow 1; if 3, send to workflow 2. I could simply check the number of characters with a substring call or whatever, but that would be a concrete solution. I am trying to do this generically for multiple documents with multiple fields. Also, each field could have multiple format checkers.

I don't want to write specific C# checks for every single field in numerous documents.
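As a rough sketch of the generic approach, each field could carry a list of stored rules, each pairing a regex with a workflow, and the first rule that matches decides the route. Everything here is hypothetical: the names `FormatRule` and `FieldRouter`, the workflow labels, and the idea of loading the rules from the database as plain pattern strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// One stored format checker: a regex plus the workflow it routes to.
public sealed class FormatRule
{
    public Regex Pattern { get; }
    public string Workflow { get; }

    public FormatRule(string pattern, string workflow)
    {
        Pattern = new Regex(pattern, RegexOptions.CultureInvariant);
        Workflow = workflow;
    }
}

public static class FieldRouter
{
    // Returns the workflow of the first rule whose pattern matches,
    // or null when no rule matches the value.
    public static string Route(string value, IEnumerable<FormatRule> rules)
    {
        foreach (var rule in rules)
        {
            if (rule.Pattern.IsMatch(value)) return rule.Workflow;
        }
        return null;
    }
}
```

For the example above, the two rules might be `^\d{2}(?!\d)` for workflow 1 and `^\d{3}(?!\d)` for workflow 2, where the negative lookahead stops a longer run of digits from matching the shorter rule. Those two patterns are my own guess at how "2 digits" versus "3 digits at the start" would be expressed.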

I'll get on with it, should keep me amused for a couple of days.

Community
Rory