Here's a GNU grep solution that uses -P to activate support for PCREs (Perl-Compatible Regular Expressions):
grep -Po '"cur_wind">\K[^<]+' \
<<<'<span class="cur_wind">with 3km/h SSW winds</span><hr class="hr_sm" /></td>'
-o specifies that only the matched part of each line is output, rather than the whole line.
\K is a PCRE feature that drops everything matched so far; this lets you provide context for more specific matching without including that context in the match.
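To see what \K contributes, here is a small sketch that runs the same pattern with and without it. The variable names (html, without_k, with_k) are illustrative only, and the sketch assumes a GNU grep built with PCRE support. It pipes the input via printf rather than a here-string so that it also runs under a plain POSIX shell.

```shell
# Illustrative input, taken from the example above.
html='<span class="cur_wind">with 3km/h SSW winds</span>'

# Without \K, the leading context is part of the reported match.
without_k=$(printf '%s\n' "$html" | grep -Po '"cur_wind">[^<]+')

# With \K, everything matched before it is dropped from the output.
with_k=$(printf '%s\n' "$html" | grep -Po '"cur_wind">\K[^<]+')

printf '%s\n' "$without_k"   # "cur_wind">with 3km/h SSW winds
printf '%s\n' "$with_k"      # with 3km/h SSW winds
```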
Another option is to use a look-behind assertion in lieu of \K:
grep -Po '(?<="cur_wind">)[^<]+' \
<<<'<span class="cur_wind">with 3km/h SSW winds</span><hr class="hr_sm" /></td>'
Of course, this kind of matching relies on the specific formatting of the input string (whitespace, single- vs. double-quoting, ordering of attributes, ...) and is thus fragile; this comes on top of the fundamental problem that grep does not understand the structure of the data.
In general, as others have noted, grep is the wrong tool for the job.
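The fragility is easy to demonstrate. The sketch below wraps the pattern from above in a hypothetical helper function, extract_wind (not part of any tool), and feeds it two pieces of markup that an HTML parser would treat as equivalent; only the double-quoted variant matches. It assumes a GNU grep built with PCRE support.

```shell
# Hypothetical helper: applies the same PCRE as above to stdin.
extract_wind() {
  grep -Po '"cur_wind">\K[^<]+'
}

# Double-quoted attribute value: exactly what the pattern expects.
double_quoted=$(printf '%s\n' '<span class="cur_wind">with 3km/h SSW winds</span>' | extract_wind)

# Single-quoted attribute value: equivalent markup, but the pattern finds nothing.
# '|| true' keeps grep's non-zero "no match" status from aborting under 'set -e'.
single_quoted=$(printf '%s\n' "<span class='cur_wind'>with 3km/h SSW winds</span>" | extract_wind || true)

printf 'double-quoted: [%s]\n' "$double_quoted"
printf 'single-quoted: [%s]\n' "$single_quoted"
```

The second line of output has empty brackets: a purely cosmetic change in the markup is enough to break the extraction.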
On macOS, assuming the input is XML (or XHTML), you can parse it robustly with the stock xmllint utility and an XPath expression:
xmllint --xpath '//span[@class="cur_wind"]/text()' - <<<\
'<td><span class="cur_wind">with 3km/h SSW winds</span><hr class="hr_sm" /></td>'
Here's a similar solution using xidel, a third-party, multi-platform web-scraping utility (which handles both HTML and XML):
xidel -q -e '//span[@class="cur_wind"]' - <<<\
'<td><span class="cur_wind">with 3km/h SSW winds</span><hr class="hr_sm" /></td>'