Referring to this question, I have a new problem. I want to extract the last two numbers (longitude/latitude) that appear in a String.

I use the approach from the referenced answer. A simple example where the error occurs:

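import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main { // class name is arbitrary
    public static void main(String[] args) {
        String input = "St. Louis (38.6389, -90.342)";

        // Optional minus sign, then any run of digits and dots
        Pattern p = Pattern.compile("-?[\\d\\.]+");
        Matcher m = p.matcher(input);

        while (m.find()) {
            System.out.println(m.group());
        }
    }
}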

The console output is as follows:

.
38.6389
-90.342

The problem appears to be the "." in "St. Louis". Can someone help me solve this issue in a clean way?

Thanks a lot for every answer/comment.

Waylander

3 Answers

Change your regex as follows:

Pattern p = Pattern.compile("-?\\d+(?:\\.\\d+)?");

The (?:\\.\\d+)? at the end makes the decimal part optional, while \\d+ before it requires at least one digit, so a lone "." no longer matches.
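Since the question asks for the last two numbers specifically, here is a minimal sketch of how the pattern above could be used for that. The class name `CoordinateExtractor` and the helper `lastTwoNumbers` are made up for illustration, and the sketch assumes the input contains at least two numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoordinateExtractor { // hypothetical name, for illustration only
    // Optional minus sign, one or more digits, optional fractional part.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    // Returns the last two numbers found in the input, in order of appearance.
    static double[] lastTwoNumbers(String input) {
        List<Double> numbers = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            numbers.add(Double.parseDouble(m.group()));
        }
        if (numbers.size() < 2) {
            throw new IllegalArgumentException("Expected at least two numbers in: " + input);
        }
        int n = numbers.size();
        return new double[] { numbers.get(n - 2), numbers.get(n - 1) };
    }

    public static void main(String[] args) {
        double[] coords = lastTwoNumbers("St. Louis (38.6389, -90.342)");
        System.out.println(coords[0]); // prints 38.6389
        System.out.println(coords[1]); // prints -90.342
    }
}
```

Collecting all matches and keeping the last two means that earlier numbers in the place name (a street or route number, say) do not disturb the result.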

Avinash Raj

[-+]?\\d+(\\.\\d+)?

Use this for floating-point numbers and integers.

The Problem appears to be the "." in "St. Louis"

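That's because of the character class [] in -?[\\d\\.]+. A character class matches any single character listed inside it, so [\\d\\.]+ accepts any run of digits and dots, including a lone "." with no digits at all.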
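To make the difference visible, the two patterns can be run side by side on the input from the question. This is only a sketch; the class `CharacterClassDemo` and the helper `findAll` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharacterClassDemo { // hypothetical name, for illustration only
    // Collects every match of the given regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "St. Louis (38.6389, -90.342)";

        // Character class: the "." after "St" is a valid one-character match.
        System.out.println(findAll("-?[\\d\\.]+", input));         // prints [., 38.6389, -90.342]

        // Digits are required before the optional fractional part.
        System.out.println(findAll("[-+]?\\d+(\\.\\d+)?", input)); // prints [38.6389, -90.342]
    }
}
```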

vks
  • This fails to catch `.9`, which is also a common way to write 0.9, including in Java's syntax – amit Mar 02 '15 at 09:49
  • @amit I guess it should be clarified with the OP whether he wants that or not – vks Mar 02 '15 at 09:51
  • Why wouldn't he? It's a legitimate notation. Maybe the OP should also explicitly clarify whether he wants to include the digit 5 as well? – amit Mar 02 '15 at 09:52
  • @amit Because these are `longitudes and latitudes`. I'm not sure whether such numbers can appear there or not – vks Mar 02 '15 at 09:53
  • If 0.9 can appear, so can .9 – amit Mar 02 '15 at 09:53

A regex like -?\d*(?:\.\d+)? makes every part optional, so the engine finds zero-width matches at every position where no number starts. A better approach is to use alternation that requires either a dot followed by digits, or one or more digits optionally followed by a dot and more digits:

-?(?:\.\d+|\d+(?:\.\d+)?)

In a Java string literal, each backslash has to be doubled:

Pattern p = Pattern.compile("-?(?:\\.\\d+|\\d+(?:\\.\\d+)?)");
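As a rough illustration of both points, the sketch below runs the all-optional pattern and the alternation pattern over short inputs. The class `AlternationDemo`, its constants and the helper `findAll` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationDemo { // hypothetical name, for illustration only
    // Every part is optional, so the empty string is a valid match.
    static final Pattern ALL_OPTIONAL = Pattern.compile("-?\\d*(?:\\.\\d+)?");

    // Either ".digits" or "digits" with an optional ".digits" suffix.
    static final Pattern ALTERNATION = Pattern.compile("-?(?:\\.\\d+|\\d+(?:\\.\\d+)?)");

    // Collects every match of the pattern in the input, empty matches included.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // The all-optional pattern reports empty strings alongside ".9".
        List<String> noisy = findAll(ALL_OPTIONAL, "a .9");
        System.out.println(noisy.contains("")); // prints true

        // The alternation only reports real numbers, with or without a leading digit.
        System.out.println(findAll(ALTERNATION, "St. Louis (.9, -.5)"));          // prints [.9, -.5]
        System.out.println(findAll(ALTERNATION, "St. Louis (38.6389, -90.342)")); // prints [38.6389, -90.342]
    }
}
```

`Double.parseDouble` accepts strings such as ".9" and "-.5", so matches of this form can be converted to numbers directly.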

Alternatively, you could add a lookahead that requires a digit, optionally preceded by a dot, at the point where the number starts:

-?(?=\.?\d)\d*(?:\.\d+)?
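A Java version of the lookahead pattern, with the backslashes doubled for the string literal, might look like the sketch below. The class `LookaheadDemo` and the helper `findNumbers` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo { // hypothetical name, for illustration only
    // (?=\.?\d) checks, without consuming input, that a digit (optionally
    // preceded by a dot) comes next; this rules out empty and dot-only matches.
    private static final Pattern NUMBER =
            Pattern.compile("-?(?=\\.?\\d)\\d*(?:\\.\\d+)?");

    // Collects every number-like match in the input, in order.
    static List<String> findNumbers(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = NUMBER.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("St. Louis (38.6389, -90.342)")); // prints [38.6389, -90.342]
        System.out.println(findNumbers("x .9 y"));                       // prints [.9]
    }
}
```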

asontu