
I'm reasonably comfortable with regular expressions, but less so with grep, and I can't figure out why the following command returns nothing:

wget -qO- 'http://www.acme.com/index.html' | grep -iPo '(?s)(^<div class="titlebar">.+?<div class="colleft">)'

I prepended (?s) so that the catch-all ".+?" also matches line breaks (CRLF, CR, or LF, depending on how the text was saved).

Any idea why it doesn't work as expected?

Thank you.

Gulbahar

1 Answer


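grep is line-oriented: it tests the pattern against each line separately, so a match that has to span the newlines between the two tags can never succeed, even with (?s). Slurp the whole input and match it as a single string instead: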

wget -qO- 'http://website.invalid/index.html' |
perl -0777 -nE 'say for /(^<div class="titlebar">.+?<div class="colleft">)/msg'
glenn jackman