I have an HTML file from which I am trying to remove the first occurrence of <...> on each line using awk's sub/gsub functions. I used the regex metacharacters ., * and + to match anything between < and >. However, the first occurrence of > seems to be skipped (escaped?), and the match runs on to the last > on the line. I don't know if there is a workaround.
Sample input file.txt (the trailing x is added so that the output lines are not empty):
<div>fruit</div></td>x
<span>banana</span>x
<br/>apple</td>x
code:
awk '{gsub(/^<.*>/,""); print}' file.txt
current output:
x
x
x
desired output:
fruit</div></td>x
banana</span>x
apple</td>x
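
For reference, one possible workaround, offered as a sketch rather than a definitive answer: awk regular expressions are greedy and have no lazy quantifier, so .* extends to the last > on the line. Replacing it with a bracket expression that excludes > makes the match stop at the first closing bracket. The snippet below recreates the sample file.txt from above and uses sub() instead of gsub(), since only one match per line is wanted.

```shell
# Recreate the sample input from the question.
printf '%s\n' '<div>fruit</div></td>x' '<span>banana</span>x' '<br/>apple</td>x' > file.txt

# [^>]* matches any run of characters other than '>', so the match
# ends at the first '>' instead of the last one on the line.
# sub() replaces only the first match per line.
awk '{ sub(/^<[^>]*>/, ""); print }' file.txt
```

With the sample input this should print the three lines listed under desired output. Note that it only removes a tag sitting at the very start of a line, because of the ^ anchor.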