
How to select data between two marker patterns which may occur multiple times with sed?

I've read some related threads here, including this one, but I'm still confused by sed's complicated parameters.

My data is web page source code. It is a total mess and is not broken into lines.

For example:

123<div>abc</div><span>DEF</span><div>ghi</div>456

I need to get the following output, from the first <div> to the last </div>. How can I do it with sed?

<div>abc</div><span>DEF</span><div>ghi</div>

Second question: given the result above, how can I extract <span>DEF</span>?

Many thanks:)

Daz81
  • While it is OK to use regex for very simple tasks, your solution needs a proper HTML parser and some program / script. – virolino Jul 19 '19 at 07:00

1 Answer

For the particular example you provided, this regex matches from the first <div> to the last </div>, because .* is greedy:

<div>.*<\/div>

Test it here.
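
Since the question asks about sed specifically, here is a rough sketch of how that regex could be applied. It assumes GNU sed (it relies on \n in the replacement text, which other seds may not support), and the function names extract_divs and extract_span are made up for illustration. The span step also assumes there is only one <span>...</span> pair on the line.

```shell
# Keep everything from the first <div> to the last </div> on each line.
# Assumes GNU sed: \n in the replacement inserts a newline as a marker.
extract_divs() {
  sed -n '/<div>.*<\/div>/{
    # Put a newline marker in front of the first <div>
    s|<div>|\n&|
    # Drop everything up to and including the marker
    s|.*\n||
    # Greedy .* runs to the last </div>; discard whatever follows it
    s|\(.*</div>\).*|\1|
    p
  }'
}

# Pull out <span>...</span>; assumes a single span pair per line.
extract_span() {
  sed -n 's|.*\(<span>.*</span>\).*|\1|p'
}

input='123<div>abc</div><span>DEF</span><div>ghi</div>456'
printf '%s\n' "$input" | extract_divs
printf '%s\n' "$input" | extract_divs | extract_span
```

Using | as the s command delimiter avoids having to escape the / in the closing tags. As the comment on the question says, this only holds up for simple, predictable input; anything more irregular needs a real HTML parser.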

virolino