
I am trying to use regular expressions to find some rates in an HTML table that I am reading into a string. Here is an example:

<td>Euro</td>
<td class='rtRates'><a href='/graph/?from=USD&amp;to=EUR'>0.772199</a></td>
<td class='rtRates'><a href='/graph/?from=EUR&amp;to=USD'>1.295003</a></td>

I am trying to find the numbers contained in the above string. They constantly change so it can't be a hard-coded number search.

I've tried using something similar to this: to=EUR'>(...)

but it only returns the 0.7, not the rest. Any help is appreciated!

EDIT: some code was requested, so here it is

    String re2="to=EUR'>(...)";   // Float 1

    Pattern p = Pattern.compile(re2,Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    Matcher m = p.matcher(webData);

    if (m.find())
    {
        String float1=m.group(1);
        System.out.print("("+float1.toString()+")"+"\n");
    }
Landon
  • Maybe you could show us what you've done so far. Providing a code sample would be a good start. – Clayton Louden Oct 14 '12 at 23:58
  • @user1745719 you should only post questions about code when you have at least attempted to create it on your own. Everybody is here to help, but not to write it for you. – Eric Leroy Oct 15 '12 at 00:04
  • Attempting to parse HTML using a regex is *doing it wrong* in almost every case. Use an HTML parser - related: http://stackoverflow.com/questions/3152138/what-are-the-pros-and-cons-of-the-leading-java-html-parsers – Brian Roach Oct 15 '12 at 00:08
  • @Eric Leroy. I have been attempting to do this, which is why I said I can't isolate the number and it only returns 0.7.. – Landon Oct 15 '12 at 00:11
  • @BrianRoach. Thank you for the link, unfortunately I have to use Regex this time. I do like the HTML parser approach better though – Landon Oct 15 '12 at 00:12
  • Okay, but the regex code should have been included, even if it doesn't work, so people can explain what you did wrong. Right now, there is no way to tell. Please think about editing your post right now and adding the Java regex to it. Reword things: instead of "I need to", say "I'm trying to". Things like that make a big difference. I was getting marked down on all my posts too until I learned. It really helps to read the http://stackoverflow.com/faq – Eric Leroy Oct 15 '12 at 00:15
  • It's a great improvement over the original! – Eric Leroy Oct 15 '12 at 00:16
  • Thanks Eric, I will keep all this in mind for the future :) – Landon Oct 15 '12 at 00:17

3 Answers


You can use this expression for quick and dirty searches:

EUR'>([^<]*)<

This is not ideal, though: using an HTML or an XHTML parser is a much better solution, because it is much more powerful and robust than any regex-based solution.
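
For context, the attempt in the question stops at 0.7 because each `.` in `(...)` matches exactly one character, so the group can never hold more than three. Below is a minimal sketch of how the expression above might be used from Java. The class name `RateRegexSketch`, the helper `extractRate`, and the `to=` prefix (borrowed from the question's own attempt) are illustrative assumptions, not part of the answer itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RateRegexSketch {

    // Hypothetical helper: captures everything between "to=<code>'>" and the
    // next '<'. Returns null when no such link is present in the input.
    static String extractRate(String html, String currencyCode) {
        Pattern p = Pattern.compile("to=" + Pattern.quote(currencyCode) + "'>([^<]*)<");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample input copied from the question.
        String webData =
              "<td>Euro</td>\n"
            + "<td class='rtRates'><a href='/graph/?from=USD&amp;to=EUR'>0.772199</a></td>\n"
            + "<td class='rtRates'><a href='/graph/?from=EUR&amp;to=USD'>1.295003</a></td>\n";

        System.out.println(extractRate(webData, "EUR")); // prints 0.772199
        System.out.println(extractRate(webData, "USD")); // prints 1.295003
    }
}
```

Because `[^<]*` stops at the first `<`, the capture does not depend on how many digits the rate has.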

Sergey Kalinichenko

To match just numbers, use positive lookarounds:

(?<=EUR'>)\\d+(?:\\.\\d*)?(?=<)
(?<=USD'>)\\d+(?:\\.\\d*)?(?=<)
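
The doubled backslashes above are Java string escapes; the regex engine itself sees `\d+(?:\.\d*)?`. As a rough sketch of how these two patterns might be applied (the class name `RateLookaroundSketch` and the helper `firstMatch` are hypothetical names chosen for illustration), note that the whole match is the number, so no capturing group is needed:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RateLookaroundSketch {

    // The two expressions from above, written as Java string literals.
    static final Pattern EUR_RATE = Pattern.compile("(?<=EUR'>)\\d+(?:\\.\\d*)?(?=<)");
    static final Pattern USD_RATE = Pattern.compile("(?<=USD'>)\\d+(?:\\.\\d*)?(?=<)");

    // Hypothetical helper: returns the first full match, or null if none.
    // The lookbehind and lookahead consume no characters, so group() is
    // just the digits (and optional fraction) between the delimiters.
    static String firstMatch(Pattern pattern, String html) {
        Matcher m = pattern.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Sample input copied from the question.
        String webData =
              "<td class='rtRates'><a href='/graph/?from=USD&amp;to=EUR'>0.772199</a></td>\n"
            + "<td class='rtRates'><a href='/graph/?from=EUR&amp;to=USD'>1.295003</a></td>\n";

        System.out.println(firstMatch(EUR_RATE, webData)); // prints 0.772199
        System.out.println(firstMatch(USD_RATE, webData)); // prints 1.295003
    }
}
```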
Ωmega

OK, this is not what you asked for, but when the text on both sides of the value you are looking for is fixed like that, you can use the substring() and indexOf() methods, which are often simpler to debug:

public class substring_not_regex {

   public static void main(String args[])
   {
      String test= "<td class='rtRates'><a href='/graph/?from=EUR&amp;to=USD'>1.295003</a></td>";     
      String result = getConversion(test,"to=USD'>");
      System.out.println("The result is: " + result);
      test= "<td class='rtRates'><a href='/graph/?from=USD&amp;to=EUR'>0.772199</a></td>";
      result = getConversion(test,"to=EUR'>");
      System.out.println("The result is: " + result);
   }

   static String getConversion(String tableLine,String toSearchFor)
   {
      String value = "";
      String aref_terminator = "</a>";
      int position = tableLine.indexOf(toSearchFor);
      if ( position == -1 ) return value;
      int start_position = position + toSearchFor.length();
      int end_position = tableLine.indexOf(aref_terminator,start_position);
      if ( end_position == -1 ) return value;
      value = tableLine.substring(start_position,end_position);
      return value;
   }
}

outputs:

The result is: 1.295003
The result is: 0.772199
Scooter