I just asked this question about using a regular expression to allow numbers between -90.0 and +90.0. I got some answers on how to implement the regular expression, but most of them also mentioned that this would be better handled without a regular expression, or that using one would be overkill. So how do you decide when to use a regular expression and when not to? Is there a checklist you can follow?
-
Added the C# tag since your previous question was for C#. Hope it's okay. – Shoban Nov 04 '10 at 15:13
-
I don't think that this question is C# specific. – Tim Schmelter Nov 04 '10 at 15:14
-
the original question was C#-implementation specific; this question is non-implementation specific. – Michael Paulukonis Nov 04 '10 at 15:21
-
Yeah I thought about it at first, but thought that this could be applied to any language. – Xaisoft Nov 04 '10 at 15:53
5 Answers
Regular expressions are a text processing tool for character-based tests. More formally, regular expressions are good at handling regular languages and bad at almost anything else.
In practice, this means that regular expressions are not well suited for tasks that require discovering meaning (semantics) in text that goes beyond the character level. This would require a full-blown parser.
In your particular case: recognizing a number in a text is an exercise that regular expressions are good at (decimal numbers can be trivially described using a regular language). This works on the character level.
But doing more advanced stuff with the number that requires knowledge of its numerical value (i.e. its semantics) requires interpretation. Regular expressions are bad at this. So finding a number in text is easy. Finding a number in text that is greater than 11 but smaller than 1004 (or that is divisible by 3) is hard: it requires recognizing the meaning of the number.
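A minimal Python sketch of that split, assuming a simple signed-decimal format: the regular expression only recognizes number-shaped text, and the range check happens on the parsed value. The pattern, the function name `latitudes_in`, and the sample text are illustrative assumptions, not part of the original question.

```python
import re

# Character-level job: recognize something shaped like a signed decimal number.
# (Assumed format: optional sign, digits, optional fractional part.)
NUMBER = re.compile(r"[+-]?\d+(?:\.\d+)?")

def latitudes_in(text, low=-90.0, high=90.0):
    """Return the numbers found in `text` whose value lies in [low, high]."""
    found = []
    for match in NUMBER.finditer(text):
        value = float(match.group())  # semantic step: interpret the characters
        if low <= value <= high:      # the range check is plain arithmetic
            found.append(value)
    return found

print(latitudes_in("readings: -90.0, 45.5, +90, 90.1, 1004"))
```

Changing the allowed range here means changing two numbers rather than rewriting a pattern.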

-
Ah, thanks for this, so recognizing -90 and +90 is easy, but determining if a number is between -90.0 and +90.0 is more of a challenge. If I am looking just for -90 or 90, then it is just simple text '-90' or '90' that I can easily parse, but if I am looking for numbers in between those, then it becomes more than just processing text. Do I understand that all correctly? That is how I interpreted what you said. – Xaisoft Nov 04 '10 at 16:06
I would say that regular expressions are most effective on strings. For other data types, operations specific to that data type will usually be more intuitive and give better results.
For example, if you know that you're dealing with a DateTime, then you can use the Parse and TryParse methods with the different formats, and that will usually be more reliable than your own regular expressions.
In your example, you are dealing with numbers, so deal with them accordingly.
Regex is very powerful, but it isn't the easiest code to read and to debug. When another reliable solution is at hand, you should probably go for that.
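As a rough Python analogue of the TryParse idea (the answer refers to C#; this is only a sketch), the standard library's `datetime.strptime` can do the validation. The helper name `try_parse_date` and the two accepted formats are assumptions chosen for illustration.

```python
from datetime import datetime

def try_parse_date(text, formats=("%Y-%m-%d", "%d/%m/%Y")):
    """Return a datetime if `text` matches one of `formats`, else None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            # Wrong shape for this format, or an impossible date; try the next one.
            continue
    return None

print(try_parse_date("2010-11-04"))
print(try_parse_date("2010-02-30"))
```

A pattern such as `\d{4}-\d{2}-\d{2}` would accept `2010-02-30`, whereas the date parser rejects it because it knows how many days February has.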

-
This is technically correct but incomplete - regex are most effective on strings *which contain regular data*. – Rex M Nov 04 '10 at 15:21
Without meaning to be circular or obtuse, you should use regular expressions when you have a string which contains information structured in a regular language, and you want to turn this string into an object model.

Basic use case for regex:
You need key-value pairs, where both keys and values are embedded within other noisy text and can't be accessed or isolated otherwise.
You need to automate extraction of these values by looping over multiple documents.
The number and combination of key-value pairs may only be discovered as you progress through the text.
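The use case above could be sketched in Python roughly as follows. The `key = value;` format, the `PAIR` pattern, and the sample documents are hypothetical; real noisy text would need a pattern tailored to it.

```python
import re

# Hypothetical format: a word-like key, '=', then a value that runs up to
# the next ';' or the end of the line.
PAIR = re.compile(r"(?P<key>\w+)\s*=\s*(?P<value>[^;\n]+)")

def extract_pairs(documents):
    """Collect every key's values across all documents, in order of appearance."""
    result = {}
    for doc in documents:
        for m in PAIR.finditer(doc):
            # Keys are not known in advance; they are discovered while scanning.
            result.setdefault(m.group("key"), []).append(m.group("value").strip())
    return result

docs = [
    "log start ... host = alpha; noise noise port=8080; more noise",
    "unrelated text user = bob; host=beta",
]
print(extract_pairs(docs))
```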

The answer is straightforward:
If you can solve your problem without regular expressions (just with string functions), don't use regular expressions. As one book I've read put it: regular expressions are violence against the computer.
If it's too complicated to do with the language's string functions, use regular expressions.
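One way to picture that rule of thumb, as a hedged Python sketch: the first two helpers get by with string methods, while the third handles a pattern with several variable parts, where a regular expression is arguably the simpler tool. All function names and the log-line format are made up for illustration.

```python
import re

def is_error_line(line):
    # Fixed text at a fixed position: a string method is enough.
    return line.startswith("ERROR:")

def split_fields(line):
    # A single, fixed delimiter: split and trim, no pattern needed.
    return [field.strip() for field in line.split(",")]

# Several variable parts in a fixed arrangement (assumed "[hh:mm:ss] LEVEL:" header).
HEADER = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s+(\w+):")

def parse_log_header(line):
    """Return (hour, minute, second, level), or None if the header is absent."""
    m = HEADER.match(line)
    if m is None:
        return None
    hour, minute, second, level = m.groups()
    return int(hour), int(minute), int(second), level

print(is_error_line("ERROR: disk full"))
print(split_fields("a, b ,c"))
print(parse_log_header("[15:13:00] WARN: low memory"))
```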

-
Though practical advice, I don't think this answer is necessarily *complete* advice. This particular question has nothing really to do with string handling beyond the fact that the user is looking at string representations of decimal data. – GrayWizardx Nov 04 '10 at 15:17