0

I have written the following regex to detect all occurrences of C. 1909:

input: C. 1909 test C.1909

\b[Cc][\.]\s*?\d+\b

This works fine.

However, when I try to detect all occurrences of 1909 C. using the following regex, it does not match anything:

input: 1909C. test 1909 C.

\b\d+\s*?[Cc][\.]\b
Nitin Sawant
  • 1
    `\s*?` what are you trying here? `\s*` means "space/tab repeated 0 or more times" and `?` means "repeated 0 or 1 times". – h2ooooooo Jul 04 '13 at 08:20
  • 1
    If you want 4 digits, use `\d{4}`. Replace `x*?` with `x*` where `x` is any regular expression. – Ingo Jul 04 '13 at 08:23
  • 2
    @h2ooooooo - That's a ["lazy" quantifier](http://stackoverflow.com/q/3075130/7586) - it isn't wrong, but it is useless in this context. – Kobi Jul 04 '13 at 08:23
  • @h2ooooooo That means "non-greedy matching" (that is, match as little as possible). The default behavior for `*` is greedy matching. This is probably due to some requirements not included in the question (I hope...). – m0skit0 Jul 04 '13 at 08:31
  • @h2ooooooo: `\s*?` It can be separated by `0 or more spaces`, @Ingo: `\d+` It can contain `1 or more digits` – Nitin Sawant Jul 04 '13 at 08:32

2 Answers

4

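`.` is not a word character, so a `\b` right after it only matches if the next character is a word character. In your input the `.` is followed by a space or the end of the string, so the final `\b` fails. Instead of the last `\b` you could use `(?!\w)`, `(?!\S)`, `\B`, or even remove it if you aren't picky.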
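As a quick check, here is a minimal sketch in Python's `re` module (the question does not say which regex engine is in use, so Python is an assumption, and the variable names are only illustrative). It runs the original pattern and the suggested alternatives against the input from the question:

```python
import re

text = "1909C. test 1909 C."

# Original pattern: the trailing \b needs a word character right after the dot.
original = re.compile(r"\b\d+\s*?[Cc][\.]\b")

# Alternatives for the trailing boundary.
not_word_after = re.compile(r"\b\d+\s*[Cc]\.(?!\w)")   # no word character follows
not_nonspace_after = re.compile(r"\b\d+\s*[Cc]\.(?!\S)")  # whitespace or end follows
non_boundary = re.compile(r"\b\d+\s*[Cc]\.\B")          # not at a word boundary

print(original.findall(text))            # []
print(not_word_after.findall(text))      # ['1909C.', '1909 C.']
print(not_nonspace_after.findall(text))  # ['1909C.', '1909 C.']
print(non_boundary.findall(text))        # ['1909C.', '1909 C.']

# The original only matches when a word character follows the dot.
print(original.findall("1909C.x"))       # ['1909C.']
```

The three alternatives differ on other inputs: `(?!\S)` rejects `1909C.,` because a comma follows, while `(?!\w)` and `\B` accept it.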

Qtax
0

Remove the trailing boundary condition `\b` and it will work.

Jegan