2

I am testing the following regex in Regex Hero:

\.((jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$

The test string is Sprite.png. Just two simple questions:

  1. Matches show as 2 groups. Why is that? The test string contains only one png.
  2. I used the same expression in a .NET RegularExpressionValidator and it doesn't validate correctly. I want a file input to accept only the extensions listed in the groups.
Deeptechtons

3 Answers

3

Matches show as 2 groups. Why is that?

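Because two groups take part in the match. ((png)) is two groups: the outer one and the inner one. The same happens with ((jpeg)|(png)) for Sprite.png: the outer group and the inner (png) group both capture, while (jpeg) stays empty.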
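A minimal sketch of how to see this from C#, assuming a compiler that accepts top-level statements (C# 9 or later). It runs the pattern from the question against Sprite.png and lists the groups that actually captured something; the variable names are made up for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The pattern from the question: one outer group around nine inner groups.
var pattern = @"\.((jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$";
var match = Regex.Match("Sprite.png", pattern);

// Group 0 is always the whole match, so skip it and keep only
// the numbered groups that took part in the match.
var captured = match.Groups
    .Cast<Group>()
    .Skip(1)
    .Where(g => g.Success)
    .Select(g => g.Value)
    .ToList();

Console.WriteLine(captured.Count);               // 2
Console.WriteLine(string.Join(", ", captured));  // png, png
```

The two values are the outer group and the inner (png) group, which is what Regex Hero reports as 2 groups.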

I used the same expression in a .NET RegularExpressionValidator and it doesn't validate correctly.

Try a simpler regex. Grouping each extension separately is entirely pointless.

\.(jpg|gif|jpeg|png|doc|docx|pdf|zip|rar)$

Also think of making the regex case-insensitive, or it won't match upper-case extensions.
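As a rough illustration, here is the simplified pattern with case-insensitive matching switched on through RegexOptions.IgnoreCase. This is a sketch for plain C# code, not for the validator control itself, and the file names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Simplified pattern from above; IgnoreCase lets .PNG match as well as .png.
var extension = new Regex(
    @"\.(jpg|gif|jpeg|png|doc|docx|pdf|zip|rar)$",
    RegexOptions.IgnoreCase);

// Hypothetical file names, chosen only for illustration.
Console.WriteLine(extension.IsMatch("Sprite.png"));  // True
Console.WriteLine(extension.IsMatch("SPRITE.PNG"));  // True
Console.WriteLine(extension.IsMatch("Sprite.exe"));  // False
```

Without the IgnoreCase option (or an equivalent inline modifier), the second call would return False.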

Tomalak
  • @Tomalak `Also think of making the regex case-insensitive,` Do you have any idea how I could do that with the RegularExpressionValidator in the .NET validation controls? – Deeptechtons Apr 11 '11 at 12:17
  • @Deeptechtons: Add `(?i)` to the start of the regex. This is called an in-line modifier. http://stackoverflow.com/questions/3542042/how-to-use-inline-modifiers-in-c-regex – Tomalak Apr 11 '11 at 12:23
  • @Tomalak Thanks. By the way, could you post a regex that validates both the filename and the extension? I tried `^[a-zA-Z0-9_]{5,20}+` but it also has to account for the `.png` extension. – Deeptechtons Apr 11 '11 at 12:27
  • @Deeptechtons: Wouldn't `"(?i)^[a-z0-9_]{5,20}\\.(jpg|gif|jpeg|png|doc|docx|pdf|zip|rar)$"` work? *PS: Tip for the next time: Think about your question and ask it **completely**. Having to add bits and bits of information in detective-Columbo-mode ("Oh, just one more question, sir!") is not very rewarding.* – Tomalak Apr 11 '11 at 12:54
  • @Tomalak You are a genius. I didn't want to bug people with another question on the same idea or background. Thanks again. – Deeptechtons Apr 11 '11 at 13:02
  • @Deeptechtons: But that's not the point! ;-) Just ask a complete question right away next time. It's not a problem if it is complex or consists of several sub-questions, as long as it is specific. It's just frustrating to answer a question more in the comment thread than anywhere else, so try to avoid that in the future. – Tomalak Apr 11 '11 at 14:07
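
The combined filename-plus-extension pattern suggested in the comments can be tried out with a short sketch like the one below. It uses a verbatim string, so the `\\.` from the comment becomes `\.`, and the sample file names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the comment thread: 5 to 20 letters, digits or underscores,
// then a dot and one of the allowed extensions. (?i) makes it case-insensitive.
var fileName = new Regex(
    @"(?i)^[a-z0-9_]{5,20}\.(jpg|gif|jpeg|png|doc|docx|pdf|zip|rar)$");

// Hypothetical inputs, chosen only for illustration.
Console.WriteLine(fileName.IsMatch("Sprite.png"));     // True
Console.WriteLine(fileName.IsMatch("My_Sprite.PNG"));  // True
Console.WriteLine(fileName.IsMatch("a.png"));          // False: name shorter than 5 characters
Console.WriteLine(fileName.IsMatch("my.sprite.png"));  // False: dot inside the name
```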
2
  1. Two groups are matching: The large one surrounding the entire alternation, and the smaller one surrounding the literal text png. You could remove the inner ones: \.(jpg|gif|jpeg|png|doc|docx|pdf|zip|rar)$ works just as well.
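  2. Try doubling the backslash (\\. instead of \.) if the pattern is written in a regular C# string literal, where a single backslash starts an escape sequence.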
Tim Pietzcker
  • How do I validate the filename now? I tried `^[a-zA-Z0-9_]+$`, but it also has to get past the `.` in the extension `.png`. The filename should contain only alphanumeric characters and underscores, with no `.` in it. Put differently, I want to validate both the filename and the extension with the same validation expression. – Deeptechtons Apr 11 '11 at 12:23
1

It has two groups because two sets of parentheses take part in the match. I've marked them with stars and spaces:

\. *(* (jpg)|(gif)|(jpeg)| *(* png *)* |(doc)|(docx)|(pdf)|(zip)|(rar) *)* $

Both of those groups match. You can turn a set of parentheses into a non-capturing group by starting it with (?: instead of (:

\.(?:(jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$

Your regex validates just fine on .NET. However, note that in C#, backslashes are special characters inside string literals. If you want a backslash to reach the regular expression, you need to escape it:

var re = new Regex("\\.(?:(jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$");

Preferably, you should use a verbatim string and avoid the clunky double-escape:

var re = new Regex(@"\.(?:(jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$");
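
A small sketch to check both points, again assuming top-level statements (C# 9 or later) and made-up variable names: the escaped and the verbatim literal hold the same pattern text, and with the outer group made non-capturing only one group captures for Sprite.png.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The escaped and the verbatim literal describe the same pattern text.
var escaped = "\\.(?:(jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$";
var verbatim = @"\.(?:(jpg)|(gif)|(jpeg)|(png)|(doc)|(docx)|(pdf)|(zip)|(rar))$";
Console.WriteLine(escaped == verbatim);  // True

var m = new Regex(verbatim).Match("Sprite.png");

// Skip group 0 (the whole match) and keep the groups that captured something.
var hits = m.Groups
    .Cast<Group>()
    .Skip(1)
    .Where(g => g.Success)
    .Select(g => g.Value)
    .ToList();

Console.WriteLine(hits.Count);  // 1
Console.WriteLine(hits[0]);     // png
```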
R. Martinho Fernandes