I know that ^ and $ mean "matches the beginning of the line" and "matches the end of the line", respectively. However, when I did some coding today, I didn't notice any difference between including and excluding them in a regular expression used in Java.

For example, I want to match a positive integer using

^[1-9]\\d*$

, and when I exclude them in the regular expression like

[1-9]\\d*

, it seems that there is no difference. I tested with a String that merely contains an integer, such as @@@123@@@, and the second regular expression still rejects it as invalid, just like the first one.

So are the two regular expressions above completely equivalent? Thanks!

roland luo
  • There *is* a difference if the regular expression matcher is *not* anchored to begin with: show the code that *uses* the regular expression. (Also, the behavior of `^` and `$` is not strictly line-based; see [Pattern.MULTILINE](http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#MULTILINE).) – user2864740 Feb 23 '14 at 20:48
  • Which method do you use? Is it `matches` or `find`? – Leos Literak Feb 23 '14 at 20:49
  • If you have used the `String#matches` method, then you are matching the entire string, irrespective of the `^` and `$`. – Chthonic Project Feb 23 '14 at 20:52
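
The distinction raised in these comments can be checked directly. The following is a minimal sketch, assuming the question's code used either `matches()` or `find()` (the question does not show which); the class and helper names are made up for illustration.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // Matcher#matches (like String#matches) requires the WHOLE input to match,
    // so the pattern behaves as if it were anchored at both ends.
    static boolean wholeInput(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // Matcher#find succeeds if ANY part of the input matches,
    // so here the anchors ^ and $ change the result.
    static boolean anywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "@@@123@@@";
        System.out.println(wholeInput("[1-9]\\d*", input));   // false
        System.out.println(wholeInput("^[1-9]\\d*$", input)); // false
        System.out.println(anywhere("[1-9]\\d*", input));     // true
        System.out.println(anywhere("^[1-9]\\d*$", input));   // false
    }
}
```

With `matches()` both patterns reject `@@@123@@@`, which would explain why no difference was visible; with `find()` only the anchored pattern rejects it.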

3 Answers

Do you need to search a string like 2343, or [SPACE]2345, or abc234?

The anchored regex will only find the number in the first string. The unanchored regex will find a number in all three, provided you search with something like `Matcher#find` rather than matching the whole string.

It all depends on what your requirements are. Are you analyzing lines in a text file, where each line contains only digits? Or are you analyzing the text in a prose document or source code, where digits may be interspersed among a whole bunch of other stuff?

In the former case, the anchors are good. In the latter, they are bad.
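
As a rough illustration of the three inputs above, here is a small sketch that searches each one with both patterns using `Matcher#find`. The class name and the `firstMatch` helper are hypothetical, and `[SPACE]2345` is written as a literal leading space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredSearch {
    // Returns the first substring of input that matches regex, or null if none does.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] inputs = {"2343", " 2345", "abc234"};
        for (String input : inputs) {
            System.out.println("[" + input + "]"
                    + " anchored: " + firstMatch("^[1-9]\\d*$", input)
                    + ", unanchored: " + firstMatch("[1-9]\\d*", input));
        }
    }
}
```

The anchored pattern finds a match only in `2343`; the unanchored one finds `2343`, `2345`, and `234` respectively.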

More info: http://www.regular-expressions.info/anchors.html

aliteralmind

They are different: the first regex checks the whole line, from beginning to end, while the second doesn't care where in the line the match occurs.

For more, check: regex-bounds
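
One caveat on "line": in Java, `^` and `$` by default refer to the start and end of the whole input (with `$` also allowed just before a final line terminator), and only become per-line anchors under `Pattern.MULTILINE`. A minimal sketch, with a hypothetical `findAll` helper and a made-up three-line input:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineAnchors {
    // Collects every match of the compiled pattern in the input, in order.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "12\nabc\n345";
        // Default mode: ^ and $ anchor to the whole input, so nothing matches.
        Pattern wholeInput = Pattern.compile("^[1-9]\\d*$");
        // MULTILINE mode: ^ and $ anchor to each line.
        Pattern perLine = Pattern.compile("^[1-9]\\d*$", Pattern.MULTILINE);
        System.out.println(findAll(wholeInput, text)); // []
        System.out.println(findAll(perLine, text));    // [12, 345]
    }
}
```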

Edwin
Well...no, the regular expressions aren't equivalent. They're also not doing what you think they are.

You intend to match a positive integer - what your regular expression aims to do is to match some character between 1 and 9, then match any number of digit characters after that (possibly none).

The difference between the two is the anchoring, as you've noted - the first regex will only match values that begin with a digit from 1 through 9, continue with zero or more digits, and have nothing else in the string.

The correct regex to match any positive number anywhere in the string would look like this:

[1-9]*\\d*

...and the correct regex to match any line that is a positive number would be this:

^[1-9]*\\d*$
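
To see how these patterns differ from the ones in the question, it may help to run both on a few inputs. This is only a sketch with a hypothetical `accepts` helper, using `Pattern.matches` so the whole string must match. Because every part of `[1-9]*\\d*` is optional, it also accepts the empty string, `0`, and `007`; whether that is acceptable depends on what "positive number" is meant to include.

```java
import java.util.regex.Pattern;

class PatternComparison {
    // True if the ENTIRE input matches the regex (Pattern.matches is implicitly anchored).
    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] inputs = {"123", "10", "0", "007", ""};
        for (String input : inputs) {
            System.out.println("\"" + input + "\""
                    + " [1-9]\\d*: " + accepts("[1-9]\\d*", input)
                    + ", [1-9]*\\d*: " + accepts("[1-9]*\\d*", input));
        }
    }
}
```

Here `[1-9]\\d*` accepts only `123` and `10`, while `[1-9]*\\d*` accepts all five inputs.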
Makoto