I've spent about an hour trying to write a better regex, but it's not my cup of tea. I need a regex that does the following (I can provide more examples if needed):

Spd_Engine          #Ok
speedengine         #Ok
enginespd           #Ok
Engine_speed        #Ok
aps_speed_engine    #Ok
engine_speed        #Ok
engine_trq          #Not Ok
speed_rpm           #Not Ok

The regex should match every line that contains at least (engine && (speed || spd))

So I came up with this:

[e,E]ngine[_]?[s,S]p[e]*d|[a-zA-Z]*[_]*[s,S]p[e]*d[_]?[e,E]ngine

But I feel it can be improved. How can I simplify it?

Sebastian 506563
Thomas Ayoub
  • For convenience, could you please edit your question with a description in words of what your validation check is? It's not obvious why aps_speed_engine is acceptable but engine_trq isn't. – shree.pat18 Dec 05 '14 at 09:46
  • Inside a character class you do not need commas; `[eE]` is the correct form. You can also use the `i` flag to ignore case. – nu11p01n73R Dec 05 '14 at 09:48
  • This isn't code review. "Improve" is opinion-based. RegEx is overkill, `input = input.ToLower(); if ((input.Contains("speed") || input.Contains("spd")) && input.Contains("engine")) { ... }` will do the same and is more readable. – CodeCaster Dec 05 '14 at 09:51
  • @nu11p01n73R I didn't manage to find the correct syntax for the `i` flag using [regex hero](http://regexhero.net/) – Thomas Ayoub Dec 05 '14 at 09:51
  • @CodeCaster Sorry, as this is not the first question about reviewing regex in SO I took my chance – Thomas Ayoub Dec 05 '14 at 09:52
  • I'm not a fan of regex and I bet your maintenance programmers won't be either. I'm with CodeCaster on this – Kell Dec 05 '14 at 09:56
  • You can start by asking a more specific question. "I feel" and "simplify" aren't very factual. _"How can I prevent the repetition of the words I'm looking for regardless of their order?"_ is a more precise question. – CodeCaster Dec 05 '14 at 09:56

2 Answers

You can use lookaheads to simplify the regex considerably:

^(?=.*spe*d)(?=.*engine).*
  • ^ anchors the regex at the start of the string

  • (?=.*spe*d) is a positive lookahead. It checks that the string contains spe*d, that is "sp", zero or more "e", then "d", which covers both spd and speed

  • (?=.*engine) is another positive lookahead. It checks that the string contains engine

  • .* matches the entire string

Regex Example
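
As a quick check of the pattern above, here is a minimal sketch that runs it against the sample names from the question. It uses Python's `re` module purely for illustration (the thread itself seems to target .NET), `re.IGNORECASE` stands in for the `i` flag, and the helper name `is_engine_speed` is hypothetical.

```python
import re

# Lookahead pattern from the answer; re.IGNORECASE plays the role of the `i` flag.
PATTERN = re.compile(r"^(?=.*spe*d)(?=.*engine).*", re.IGNORECASE)

# Sample names and expected verdicts, copied from the question.
SAMPLES = {
    "Spd_Engine": True,
    "speedengine": True,
    "enginespd": True,
    "Engine_speed": True,
    "aps_speed_engine": True,
    "engine_speed": True,
    "engine_trq": False,
    "speed_rpm": False,
}


def is_engine_speed(name: str) -> bool:
    """Return True if the name contains 'engine' and 'sp' + zero or more 'e' + 'd'."""
    return PATTERN.match(name) is not None


for name, expected in SAMPLES.items():
    print(f"{name:<18} matched={is_engine_speed(name)} expected={expected}")
```

Because each lookahead scans from the start of the string, the order of the two words does not matter, so no alternation is needed.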

OR

^(?=.*spe*d).*engine.*

This version drops the second lookahead.

Regex Example
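
To see that dropping the second lookahead does not change the verdicts on the question's samples, the following sketch compares both forms side by side. Python's `re` is again only an illustrative host, and the helper `verdicts` is a hypothetical name.

```python
import re

# Both forms from the answer, compiled case-insensitively.
TWO_LOOKAHEADS = re.compile(r"^(?=.*spe*d)(?=.*engine).*", re.IGNORECASE)
ONE_LOOKAHEAD = re.compile(r"^(?=.*spe*d).*engine.*", re.IGNORECASE)

# Sample names from the question.
NAMES = [
    "Spd_Engine", "speedengine", "enginespd", "Engine_speed",
    "aps_speed_engine", "engine_speed", "engine_trq", "speed_rpm",
]


def verdicts(pattern, names):
    """Return one boolean per name: True if the pattern matches from the start."""
    return [pattern.match(name) is not None for name in names]


both_agree = verdicts(TWO_LOOKAHEADS, NAMES) == verdicts(ONE_LOOKAHEAD, NAMES)
print("forms agree on all samples:", both_agree)
```

The second form works because the remaining lookahead consumes no characters, so `.*engine.*` still searches the whole string for engine.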

Notes on [e,E]ngine[_]?[s,S]p[e]*d|[a-zA-Z]*[_]*[s,S]p[e]*d[_]?[e,E]ngine

  • [e,E] A comma inside a character class is not a separator. It is a literal character, so [e,E] matches e, E, or a comma. You can write it as

    [eE]

  • [_]? There is no advantage in wrapping a single character in a character class. It is the same as writing _?

  • i flag: the i flag can be used to ignore case while matching the regex
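
The notes above can be illustrated with a small sketch. It assumes Python's `re` module, where `re.IGNORECASE` corresponds to the `i` flag; the test strings are made up for the demonstration.

```python
import re

# The comma in [e,E] is a literal member of the class, not a separator.
with_comma = re.compile(r"[e,E]ngine")
without_comma = re.compile(r"[eE]ngine")

# With the case-insensitive flag, no character class is needed at all.
case_insensitive = re.compile(r"engine", re.IGNORECASE)

# [_]? and _? describe the same thing: an optional underscore.
bracketed = re.compile(r"engine[_]?spd")
plain = re.compile(r"engine_?spd")

print(bool(with_comma.fullmatch(",ngine")))       # True: the comma is accepted
print(bool(without_comma.fullmatch(",ngine")))    # False
print(bool(case_insensitive.fullmatch("ENGINE"))) # True
print(bool(without_comma.fullmatch("ENGINE")))    # False: only the first letter varies
print(all(bool(bracketed.fullmatch(s)) == bool(plain.fullmatch(s))
          for s in ["enginespd", "engine_spd", "engine__spd"]))  # True
```

Note that `[eE]ngine` only relaxes the first letter, whereas the flag makes the whole word case-insensitive.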

nu11p01n73R

If a shorter version of this regex counts as a simplification, this is what came to mind:

[e,E]ngine_?[s,S]pe*d|[a-zA-Z]*_*[s,S]pe*d_?[e,E]ngine

VS

[e,E]ngine[_]?[s,S]p[e]*d|[a-zA-Z]*[_]*[s,S]p[e]*d[_]?[e,E]ngine

Sebastian 506563