I want to use awk to extract data from this table, but I can't get the right output. Each row in the table looks like this:

<tr>
    <td class="center">4
    </td>
    <td>Bergkrystallen via Majorstuen
    </td>
    <td>
    <img src='/Content/img/train2.png'/>
    </td>
    <td>18:55
    </td>
    <td class="center">1</td>
</tr>

I want this: 4 Bergkrystallen via Majorstuen 18:55

I've tried using awk, but I can't get it right:

awk -F "</?td.*>" '/<\/?td.*>.*/ {print $2 }' file.html

1 Answer

Try:

 awk -F "</?td.*>" '/<\/?td.*>.*/ {printf "%s ",$2 } END {printf "\n"}' file.html

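Note: this probably only works if your source HTML is formatted consistently, with each cell's text on the same line as its opening <td> tag. Lines such as a bare </td> also match and print an empty field, so expect some extra spaces in the output, and the single newline in END means every row ends up on one line.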
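If you need one output line per table row, something along these lines might be a starting point. It is only a sketch: the function name extract_rows is made up for illustration, and it assumes the layout from the question (cell text on the same line as the opening <td>, closing </td> on its own line, one </tr> per row). Cells that open and close on the same line, like the last column, are skipped on purpose so that the trailing "1" is left out.

```shell
# Hypothetical helper (name is an assumption): reads HTML on stdin and
# prints one line per <tr> row, joining the cell texts with single spaces.
extract_rows() {
  awk '
    /<td/ {
      cell = $0
      # Assumption: a cell closed on the same line is the unwanted last column.
      if (cell ~ /<\/td>/) next
      # Drop everything up to and including the opening <td ...> tag.
      sub(/^.*<td[^>]*>/, "", cell)
      # Trim leading and trailing whitespace (including a stray CR).
      gsub(/^[ \t]+|[ \t\r]+$/, "", cell)
      # Append non-empty cell text to the current row.
      if (cell != "") row = (row == "" ? cell : row " " cell)
    }
    /<\/tr>/ {
      # End of a table row: emit what was collected and start over.
      if (row != "") print row
      row = ""
    }
  '
}
```

You would call it as extract_rows < file.html. Like the one-liner above, it depends on the line layout staying the same; for HTML that varies, a real HTML parser is the safer choice.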

:)
Dale

Dale_Reagan