1

How to find the shortest matching string, returning the first occurrence

I have this string. I am looking for a td whose value contains blabla, up to and including its closing td. For example:

  <tr blabla><td>blabla big content</td></tr><tr><td>thisisnot</td></tr>

I want only this string

  <tr blabla><td>blabla big content</td></tr>

I am using this regex in .NET:

<tr.*><td>blabla.*</td></tr>

I am new to regex.

Can anyone tell me a way out?

Nag
  • 11
  • 2
  • 10
    Obligatory [you do not parse HTML with a regex](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) link. – CanSpice May 19 '11 at 21:31
  • 3
    No, it is a very small piece of HTML content that I am trying to parse. Just treat it as plain text instead of HTML. – Nag May 19 '11 at 21:34
  • 2
    In the immortal words of Mike Holmes, "If you're going to do something, do it right the first time." Don't use a regex to parse HTML, even if it's small, because these things never stay small. Use an HTML parser. – CanSpice May 19 '11 at 21:36

2 Answers

6

Regex quantifiers are greedy by default: "*" consumes as much input as it can while still letting the overall pattern match, so your pattern runs on to the last </td></tr> in the string.

You need to use a non-greedy (lazy) quantifier in your pattern. So instead of "*" use "*?", and then use a group to capture the match. To capture part of a match, enclose it in a set of parentheses. The following seems to do the trick:

(<tr.*?><td>blabla.*?</td></tr>).*

This creates a capture group, which you then read from the match result (in .NET, match.Groups[1].Value).
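As a rough illustration of the difference, here is a minimal sketch in Python rather than .NET (an assumption on my part; Python's re module treats "*" and "*?" the same way for these patterns). The variable names are made up for the example, and the input is the sample string from the question.

```python
import re

# Sample input taken from the question.
html = "<tr blabla><td>blabla big content</td></tr><tr><td>thisisnot</td></tr>"

# Greedy pattern from the question: ".*" runs on to the last </td></tr>.
greedy = re.search(r"<tr.*><td>blabla.*</td></tr>", html)

# Lazy pattern from the answer: ".*?" stops at the first </td></tr>.
lazy = re.search(r"(<tr.*?><td>blabla.*?</td></tr>).*", html)

greedy_match = greedy.group(0)  # the whole input string
lazy_match = lazy.group(1)      # only the first row

print(greedy_match)
print(lazy_match)
```

Because of the trailing ".*", the overall match (group 0) still covers the whole string, so the wanted text has to be read from group 1. Dropping the trailing ".*" and reading the overall match should work as well.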

James
Xhalent
0

Use (?<=<td>)[^<]+ as the regex to match the text inside each td, then compare the lengths of the matches to find the shortest one.
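A minimal sketch of that idea, again in Python as an assumption (the lookbehind pattern itself is unchanged). The filter on "blabla" is my addition, not part of the answer: without it, the shortest cell in the question's sample would be thisisnot.

```python
import re

# Sample input taken from the question.
html = "<tr blabla><td>blabla big content</td></tr><tr><td>thisisnot</td></tr>"

# Match the text that follows each <td>, up to the next "<".
cells = re.findall(r"(?<=<td>)[^<]+", html)

# Assumption: keep only cells that start with "blabla", as in the question's pattern.
candidates = [c for c in cells if c.startswith("blabla")]

# min() returns the first of several equally short items, i.e. the first occurrence.
shortest = min(candidates, key=len)

print(cells)
print(shortest)
```

This returns only the cell text, not the surrounding tr and td tags, so it fits best when the content itself is what is needed.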

daalbert