
I have this in a file:

<tr class="LightRow Center" style="height:auto;">
<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">0</td>
<td class="SmallText resultgoodB" title="Compliant/Non-Vulnerable/Patched" style="width:20%">1</td>
<td class="SmallText errorB" title="Error" style="width:20%">0</td>
<td class="SmallText unknownB" title="Unknown" style="width:20%">0</td>
<td class="SmallText otherB" title="Inventory/Miscellaneous class, or Not Applicable/Not Evaluated result" style="width:20%">0</td>
</tr>
</table>

I am trying to get at the text from this row:

<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">0</td>

This is being done in a shell script, and I am trying to use bash regular expressions.

I have tried this shell script:

#!/bin/bash
set -x
REGEX_EXPR='\<td\ class=\"SmallText\ resultbadB\"\ title=\"Non-Compliant\/Vulnerable\/Unpatched\"\ style=\"width\:20\%\"\>\(.*\)\</td\>'

[[ /tmp/result.html =~ $REGEX_EXPR ]]
echo "output $?"
echo ${BASH_REMATCH[0]}
echo ${BASH_REMATCH[1]}

However, I get a no-match status (1) from echo "output $?". I have tried the following regexes as well:

REGEX_EXPR='<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">\(.*\)</td>'
REGEX_EXPR='<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">(.*)</td>'

I also tried some other escape combinations, for example escaping just the quotes, defining the variable in double quotes, and so on.

Any thoughts on where I am messing up?


Yogesh_D
  • Seems to be a problem with the whitespace in the file. Compare: https://stackoverflow.com/questions/18514135/bash-regular-expression-cant-seem-to-match-s-s-etc – Matthias J. Sax Jul 07 '15 at 09:46

1 Answer


The problem is not in the regex, but in what you try to match it against.

[[ /tmp/result.html =~ $REGEX_EXPR ]]

This means the string /tmp/result.html is being matched, not the contents of the file. To match line by line, you'll need a loop:

# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line ; do
    if [[ "$line" =~ $REGEX_EXPR ]] ; then
         ...
    fi
done < /tmp/result.html
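
As a rough end-to-end sketch of the line-by-line approach, the script below reads a file and prints the captured cell text for the first matching line. The function name extract_bad_count and the temporary sample file are made up for illustration; the sample stands in for /tmp/result.html. The pattern is the unescaped variant from the question, since =~ uses POSIX extended regular expressions, where a capture group is written as plain ( ) and \( would match a literal parenthesis.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): print the text of the
# "resultbadB" cell from the HTML file given as the first argument.
extract_bad_count() {
    local file=$1 line
    # Single quotes keep the pattern literal; no escaping of spaces or quotes
    # is needed because the regex is expanded unquoted on the right of =~.
    local regex='<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">(.*)</td>'
    # The "|| [[ -n $line ]]" part also handles a last line without a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $regex ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"   # text between <td ...> and </td>
            return 0
        fi
    done < "$file"
    return 1   # no line matched
}

# Demo input: a temporary file standing in for /tmp/result.html.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tr class="LightRow Center" style="height:auto;">
<td class="SmallText resultbadB" title="Non-Compliant/Vulnerable/Unpatched" style="width:20%">0</td>
<td class="SmallText resultgoodB" title="Compliant/Non-Vulnerable/Patched" style="width:20%">1</td>
</tr>
EOF

bad_count=$(extract_bad_count "$sample")
echo "bad count: $bad_count"
rm -f "$sample"
```

This only works when the whole <td>…</td> element sits on one line, as it does in the file shown in the question; for HTML that may wrap or vary in attribute order, a real HTML parser is likely a safer choice.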
choroba
  • I am like https://www.youtube.com/watch?v=khSIYmTzt6U: I was so focused on the regex that I didn't pay attention to what I was searching in. – Yogesh_D Jul 07 '15 at 10:06