I have a block of HTML that spans multiple lines. How do I make my regex match non-greedily? The pattern I have now is greedy.

Example data:

</TD> 
<TD CLASS='statusEven'><TABLE BORDER=0 WIDTH='100%' CELLSPACING=0 CELLPADDING=0><TR><TD             ALIGN=LEFT><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0> 
<TR> 
<TD ALIGN=LEFT valign=center CLASS='statusEven'><A HREF='extinfo.cgi?    type=2&host=localhost&service=Current+Load'>Current Load</A></TD></TR> 
</TABLE> 
</TD> 
<TD ALIGN=RIGHT CLASS='statusEven'> 
<TABLE BORDER=0 cellspacing=0 cellpadding=0> 
<TR> 
</TR> 
</TABLE> 
</TD> 
</TR></TABLE></TD> 
<TD CLASS='statusOK'>OK</TD> 
<TD CLASS='statusEven' nowrap>08-04-2011 22:07:00</TD> 
<TD CLASS='statusEven' nowrap>28d 13h 18m 11s</TD> 
<TD CLASS='statusEven'>1/1</TD> 
<TD CLASS='statusEven' valign='center'>OK &#45; load average&#58; 0&#46;01&#44; 0&#46;04&#44; 0&#46;05&nbsp;</TD> 

Here's my code so far:

Pattern p = Pattern.compile("(?s)<TD ALIGN=LEFT valign=center CLASS(.*)?<TABLE");
Matcher m = p.matcher(this.resultHTML);

if(m.find())
{
     return m.group(1);
}
AndersTornkvist
kireol

    If I may advise: do NOT parse HTML with regex. It won't work next week. Use an HTML parser, like Neko or HTMLUnit. – Ondra Žižka Aug 05 '11 at 02:43

    You might want to read the reply in this thread; it's funny and true: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – KaKa Aug 05 '11 at 02:47

2 Answers

Ungreedy (put the ? directly after the *):

Pattern.compile("(?s)<TD ALIGN=LEFT valign=center CLASS(.*?)?<TABLE");

Also, check this:

Java Regexp: UNGREEDY flag

I've implemented UNGREEDY for JDK's regex.
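
For illustration, here is a minimal sketch of the question's snippet with the ungreedy pattern applied. The class name LazyCellExtractor, the method name extract, and the shortened sample markup are assumptions made up for this example, not part of the original code. The extra ? after the group is left out here, since the group does not need to be optional.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, for illustration only.
class LazyCellExtractor {
    // (?s) lets '.' match line breaks; '.*?' stops at the first following "<TABLE".
    private static final Pattern CELL = Pattern.compile(
            "(?s)<TD ALIGN=LEFT valign=center CLASS(.*?)<TABLE");

    /** Returns the captured text, or null when the pattern does not match. */
    static String extract(String html) {
        Matcher m = CELL.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the question's markup (not the full page).
        String html =
                "<TD ALIGN=LEFT valign=center CLASS='statusEven'>"
              + "<A HREF='extinfo.cgi'>Current Load</A></TD></TR>\n"
              + "</TABLE>\n"
              + "<TD ALIGN=RIGHT CLASS='statusEven'>\n"
              + "<TABLE BORDER=0 cellspacing=0 cellpadding=0>\n"
              + "</TABLE>\n"
              + "<TABLE BORDER=0>\n";
        // Prints everything between CLASS and the first "<TABLE",
        // even though a second "<TABLE" appears later in the input.
        System.out.println(extract(html));
    }
}
```

With the greedy (.*) the capture would instead run up to the last <TABLE in the input, which on a full status page is usually far more than intended.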

Ondra Žižka

To make a quantifier non-greedy, you add a question mark immediately after it:

.*    // greedy

.*?   // non-greedy

What you've got there - (.*)? - is a greedy .* in a capturing group, said group being optional (the ? is serving in its original role, as a zero-or-one quantifier).
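
The difference between the three forms can be seen in a small sketch like the following. The class GreedyVsLazy, the helper capture, and the toy input string are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class GreedyVsLazy {
    /** Returns group 1 of the first match, or null if there is none. */
    static String capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "CLASS=1<TABLE>2<TABLE>3";

        // Greedy: runs to the LAST "<TABLE".
        System.out.println(capture("CLASS(.*)<TABLE", input));   // =1<TABLE>2

        // Non-greedy: stops at the FIRST "<TABLE".
        System.out.println(capture("CLASS(.*?)<TABLE", input));  // =1

        // Optional group around a greedy .* : still greedy.
        System.out.println(capture("CLASS(.*)?<TABLE", input));  // =1<TABLE>2
    }
}
```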

Alan Moore