I'm trying to write a Bash script that extracts results from an HTML page. I managed to fetch the page content with curl, but the next step, parsing the output, is where I'm stuck.
The relevant part of the page looks like this:
<div class="result">
...
<div class="item">
<div class="item_title">ITEM 1</div>
</div>
...
<div class="item_desc">
ITEM DESCRIPTION 1
</div>
...
</div>
<div class="result">
...
<div class="item">
<div class="item_title">ITEM 2</div>
</div>
...
<div class="item_desc">
ITEM DESCRIPTION 2
</div>
...
</div>
I'd like the output to look something like this:
ITEM 1;ITEM DESCRIPTION 1
ITEM 2;ITEM DESCRIPTION 2
I know a bit of grep, but I can't wrap my head around how to make it work here. Some people have also told me to use awk, which seems better suited to this kind of task.
I'd appreciate any help.
Thank you very much.
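
For illustration, here is a minimal awk sketch of the extraction described above. It is not a general HTML parser. It assumes the markup is laid out exactly as in the sample: each item_title div sits on a single line, and each item_desc block ends at the first following line that contains </div>. The function name extract_items is made up for this example, and titles are printed exactly as they appear in the markup.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads HTML on stdin and prints one "title;description"
# line per result. Assumes the line layout shown in the sample markup.
extract_items() {
  awk '
    # Title line: strip everything around the text of the item_title div.
    /<div class="item_title">/ {
      title = $0
      sub(/.*<div class="item_title">/, "", title)
      sub(/<\/div>.*/, "", title)
    }

    # Opening of a description block: start collecting text.
    /<div class="item_desc">/ { in_desc = 1; desc = ""; next }

    # First closing div after the description text: emit the pair.
    in_desc && /<\/div>/ {
      print title ";" desc
      in_desc = 0
      next
    }

    # Description text: trim whitespace and join multiple lines with a space.
    in_desc {
      line = $0
      gsub(/^[ \t]+|[ \t]+$/, "", line)
      if (line != "") desc = (desc == "" ? line : desc " " line)
    }
  '
}

# Usage (the URL is a placeholder, not the real page):
#   curl -s "https://example.com/results" | extract_items
```

If the real page nests further divs inside item_desc or puts several tags on one line, this line-based approach will break, and a proper HTML parser would be the safer choice.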