I'm trying to grab data from HTML output that looks like this:
<strong>Target1NoSpaces</strong><span class="creator"> ....
<strong>Target2 With Spaces</strong><span class="creator"> ....
I'm using a pipe train to whittle down the data to the targets I'm trying to hit. Here's my approach so far:
grep "/strong" output.html | awk '{print $1}'
Grep on "/strong" to get the lines with the targets; that works fine.
Pipe to 'awk '{print $1}'. That works in case #1 when the target has no spaces, but fails in case #2 when the target has spaces..only the first word is preserved as below:
<strong>Target1NoSpaces</strong><span
<strong>Target2
Do you have any tips on hitting the target properly, either in my awk or in different command? Anything quick and dirty (grep, awk, sed, perl) would be appreciated.