0

I have a bunch of strings that may contain certain patterns, specifically the following three.

  1. Starts with (- followed by 10 digits followed by ).

    E.g.:

    (-1234567890)

  2. Starts with (, ends with ), and may contain 1 or more characters, but NO spaces.

    E.g.:

    (ABC) or (AF33) or (2345)

  3. Starts with (, ends with ), and may contain 1 or more characters, INCLUDING spaces.

    E.g.:

    (Some string)

The strings I work with may contain zero or more of the patterns above. My requirement is to match ONLY the second one from above in a given string, and I'd like to be able to use the Regex class in C#.

For example, let's say the following are five different strings I have.

This is some random text.

This is some (ABC) random (-1234567890) text.

This is some (XY12) random (-1234567890) text.

This is some (Contains space) random (-1234567890) text.

This is some () random text.

My Regex should match only the 2nd and 3rd strings from the above list.

So far, I've managed to write the following Regex, which excludes strings 1 and 5.

.*\((?!\-).+\).*

This matches the 2nd, 3rd, AND 4th strings above. Now I'm not sure how to get it to exclude the 4th one, which contains spaces inside the parentheses. I know that \s matches whitespace (and \S matches non-whitespace), but how can I tell it to match only when there are no spaces inside parentheses that don't have a - right after the opening (?
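For reference, here is a minimal sketch that reproduces this behaviour with Regex.IsMatch. The `samples` array and the variable names are only for illustration, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The five sample strings from the question, in the same order.
string[] samples =
{
    "This is some random text.",
    "This is some (ABC) random (-1234567890) text.",
    "This is some (XY12) random (-1234567890) text.",
    "This is some (Contains space) random (-1234567890) text.",
    "This is some () random text."
};

// The current attempt: '(' not followed by '-', then one or more of any character, then ')'.
var current = new Regex(@".*\((?!\-).+\).*");

// One flag per sample: does the pattern match somewhere in the string?
bool[] matched = samples.Select(s => current.IsMatch(s)).ToArray();

for (int i = 0; i < samples.Length; i++)
    Console.WriteLine($"String {i + 1}: {matched[i]}");
```

Strings 2, 3, and 4 come back as matches because `.+` also accepts spaces.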

EDIT 1:

There will never be nested parentheses in my strings.

EDIT 2:

Here's a Regex Tester.

Sach
  • whats your char set? is it ascii only? is it possible to have another ( inside ()? like (\\() – Steve Apr 27 '18 at 21:50
  • Both ASCII and UTF8 would be preferable, but I'd be happy even with just ASCII if it's too difficult to incorporate both. – Sach Apr 27 '18 at 21:52
  • And no, there will never be nested parenthesis. – Sach Apr 27 '18 at 21:53
  • Something simple like [`.*?\(\w+\).*`](https://regex101.com/r/FKV8fl/1) would be sufficient, wouldn't it? – bobble bubble Apr 27 '18 at 22:04
  • I think you need [`@"^[^()]*(?:\(([^\s()]+)\)[^()]*)+$"`](http://regexstorm.net/tester?p=%5e%5b%5e%28%29%5cn%5d*%28%3f%3a%5c%28%28%5b%5e%5cs%28%29%5d%2b%29%5c%29%5b%5e%28%29%5cn%5d*%29%2b%5cr%3f%24&i=This+is+some+random+text.%0d%0aThis+is+some+%28ABC%29+random+%28-1234567890%29+text.%0d%0aThis+is+some+%28XY12%29+random+%28-1234567890%29+text.%0d%0aThis+is+some+%28Contains+space%29+random+%28-1234567890%29+text.%0d%0aThis+is+some+%28%29+random+text.&o=m). If you need a whole string that contains the pattern, use `if (Regex.IsMatch(s, pattern)) { /*do something with s*/ }`. – Wiktor Stribiżew Apr 27 '18 at 22:36

2 Answers

3
.*\(\w+\).*

With the regex above, only the second and third strings match.

.* any characters (zero or more)

\( a literal opening parenthesis

\w+ one or more word characters (letters, digits, underscore)

\) a literal closing parenthesis

.* any characters (zero or more)
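As a rough check, the pattern can be run against the five strings from the question with Regex.IsMatch. This is only a sketch: the `samples` array and the variable names are made up for the example, and it assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The five sample strings from the question, in the same order.
string[] samples =
{
    "This is some random text.",
    "This is some (ABC) random (-1234567890) text.",
    "This is some (XY12) random (-1234567890) text.",
    "This is some (Contains space) random (-1234567890) text.",
    "This is some () random text."
};

// Word characters only between the parentheses: no spaces and no '-'.
var wordOnly = new Regex(@".*\(\w+\).*");

// One flag per sample: does the pattern match somewhere in the string?
bool[] matched = samples.Select(s => wordOnly.IsMatch(s)).ToArray();

for (int i = 0; i < samples.Length; i++)
    Console.WriteLine($"String {i + 1}: {matched[i]}");
```

Since IsMatch looks for a match anywhere in the input, `\(\w+\)` on its own should give the same true/false results. Note that `\w` excludes punctuation, so something like (A-B) would not match.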

Adem Catamak
  • That does seem to work, thanks! Can you please explain it, maybe breaking down? Especially the `(\w|\d)+` part? – Sach Apr 27 '18 at 22:00
  • `\w` will match any 'word character' (https://stackoverflow.com/questions/2998519/net-regex-what-is-the-word-character-w). `\d` will match any numeric digit. `(\w|\d)` will match a word character or a digit. `(\w|\d)+` will match one or more of a word character or a digit. After my edit I wonder if just `(\w)+` would work for that part since it seems that `\w` also matches digits. – Tony Tuttle Apr 27 '18 at 22:04
  • @huck_cussler yes, you're right. I noticed that too and fixed my answer. – Adem Catamak Apr 27 '18 at 22:09
1
\(([^- ]+[^ ]*)\)

should work

Explanation:

[^- ]+ matches one or more characters that are neither - nor a space. This makes sure the parentheses contain at least one character and that the first one is not -.

Then [^ ]* matches zero or more characters that are not a space.

This will work for any char set
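A small sketch of how this could be used from C#, including reading the text inside the parentheses from capture group 1. The `samples` array and the variable names are assumptions for the example, and the snippet assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The five sample strings from the question, in the same order.
string[] samples =
{
    "This is some random text.",
    "This is some (ABC) random (-1234567890) text.",
    "This is some (XY12) random (-1234567890) text.",
    "This is some (Contains space) random (-1234567890) text.",
    "This is some () random text."
};

// First character inside '(' is neither '-' nor a space; no spaces before ')'.
var noSpaces = new Regex(@"\(([^- ]+[^ ]*)\)");

// Captured content of the first match per sample, or "" when nothing matches.
string[] captured = samples
    .Select(s => noSpaces.Match(s))
    .Select(m => m.Success ? m.Groups[1].Value : "")
    .ToArray();

for (int i = 0; i < samples.Length; i++)
    Console.WriteLine($"String {i + 1}: '{captured[i]}'");
```

Only strings 2 and 3 produce a capture (ABC and XY12). The character classes exclude only the literal space character, so swapping in `\s` would be needed if tabs or other whitespace should also be rejected.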

Steve