0

I have the following HTML:

<tr><td><font color="#306eff">P: </font>9283-1000<font color="#306eff">&nbsp;&nbsp;

Or, with a newline before the second font tag:

<tr><td><font color="#306eff">P: </font>9283-1000

<font color="#306eff">&nbsp;&nbsp;

I went to regexpal.com and entered the following regex:

P: </font>(.*?)<font

And it matches. But when I do it in Java, it doesn't match:

    Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
    Matcher mP = rP.matcher(data);

    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }

There are multiple regexes I tried on different occasions and they simply don't work in Java. Any suggestions? Thanks!

user2320462

4 Answers

2

Your code works fine for me.

    public static void main(String[] args) {
        String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
        Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
        Matcher mP = rP.matcher(data);

        if (mP.find()) {
            System.out.println(mP.group(1).trim());
        }
    }

This prints: 9283-1000.

I guess the problem is in how data is fed into the program, because the code itself is OK, as you can see from this output.
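One way to check the input is to print it with its line breaks made visible. The following is only a minimal sketch: the class name ShowLineBreaks and the helper visible are hypothetical, and the sample string assumes a Windows-style line break after the number, which may not be what your data actually contains.

```java
class ShowLineBreaks {
    // Hypothetical helper: replaces carriage returns and line feeds
    // with the visible two-character sequences \r and \n.
    static String visible(String data) {
        return data.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        // Assumed sample input: the second variant from the question,
        // with a CR LF pair before the second font tag.
        String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000\r\n<font color=\"#306eff\">&nbsp;&nbsp;";
        System.out.println(visible(data));
    }
}
```

If the printed text shows \r or \n between the number and the next <font, the input matches the second variant in the question rather than the single-line string tested above.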

peter.petrov

1

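By default, the dot does not match line terminators, so the pattern fails when there is a newline between the number and the next <font.

Compile the pattern with the DOTALL flag instead:

    Pattern rP = Pattern.compile(">P: </font>(.*?)<font", Pattern.DOTALL);

Reference: the DOTALL entry in the java.util.regex.Pattern Javadoc.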
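To see the effect of the flag, here is a small sketch that runs the question's pattern against both variants of the HTML. The class name DotAllDemo and the helper extract are hypothetical, and the multi-line sample assumes the line breaks are plain \n characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: returns the trimmed text between "P: </font>"
    // and the next "<font", or null when the pattern does not match.
    static String extract(String data, int flags) {
        Pattern rP = Pattern.compile(">P: </font>(.*?)<font", flags);
        Matcher mP = rP.matcher(data);
        return mP.find() ? mP.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String oneLine = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
        // Assumed form of the second variant: line breaks before the second font tag.
        String twoLines = "<tr><td><font color=\"#306eff\">P: </font>9283-1000\n\n<font color=\"#306eff\">&nbsp;&nbsp;";

        System.out.println(extract(oneLine, 0));               // 9283-1000
        System.out.println(extract(twoLines, 0));              // null: the dot stops at the newline
        System.out.println(extract(twoLines, Pattern.DOTALL)); // 9283-1000
    }
}
```

Without the flag, the lazy group cannot cross the line break, so find() returns false for the multi-line input. With Pattern.DOTALL the group captures the number together with the trailing line breaks, which trim() then removes.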

aalku

0

Try this regex instead:

(?ims).*?>P: </font>(.*?)<font.+

Sample code

    public static void main(String[] args) {
        String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
        Pattern rP = Pattern.compile("(?ims).*?>P: </font>(.*?)<font.+");
        Matcher mP = rP.matcher(data);

        if (mP.find()) {
            System.out.println(mP.group(1).trim());
        }
    }

Output

9283-1000

Stephan

0

Try this:

    String data = "<tr><td><font color=\"#306eff\">P: </font>9283-1000<font color=\"#306eff\">&nbsp;&nbsp;";
    Pattern rP = Pattern.compile(">P: </font>(.*?)<font");
    Matcher mP = rP.matcher(data);

    if (mP.find()) {
        System.out.println(mP.group(1).trim());
    }

In Java, the only difference is the escape character (\) needed before each double quote inside the string literal.

Sujith PS
  • sorry, I don't see the difference – user2320462 Jan 09 '14 at 11:21
  • The difference is in the data variable. In Java, a string literal must be enclosed in double quotes, and your HTML itself contains double quotes, so you need to escape those characters with the escape character \. – Sujith PS Jan 09 '14 at 11:23