I have a file, input.txt,
which contains lots of weird characters, HTML tags, and some useful text. I want to extract the 35 characters that follow the word "description",
excluding weird characters like $$#$#@$#@***$#
and without the HTML tags, and write them to a new file, output.txt. Please help.
Thanks in advance.
My final goal is to find the word "description" and print the 35 characters after it, not counting HTML tags or weird characters. Is that possible? For example, given:
<description><p><img class="float_right"
src="http://static3.businessinsider.com/image/502ab0036bb3f7147b00000f-400-300/dnu.jpg"
border="0" alt="dnu" width="400" height="300" /></p><p>The lawn
was filled with <a class="hidden_link"
href="http://www.businessinsider.com/blackboard/goldman-sachs">Goldman
Sachs</a> Group Inc. partners dressed in pink looking out on a pink sunset.
I want to start from: The lawn was filled with
(skip the tags again and continue from) Group Inc. partners
(stop once 35 characters have been collected), and then search for the next description!
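
One way this could be approached, as a rough Python sketch rather than a definitive answer: split the text at each opening <description> tag, remove anything that looks like a tag, drop characters outside a small whitelist, collapse whitespace, and keep the first 35 characters. The names snippets and process_file, the whitelist of allowed characters, and the choice to keep link text such as "Goldman Sachs" (only the tags themselves are removed) are all assumptions made for illustration. A regex is a crude tag stripper, so badly malformed markup may need a real HTML parser.

```python
import re
from html import unescape

# Anything between "<" and ">" is treated as a tag (tags may span lines).
TAG_RE = re.compile(r"<[^>]*>")
# Assumption: "weird characters" are anything other than letters, digits,
# whitespace, and basic punctuation. Adjust the whitelist as needed.
JUNK_RE = re.compile(r"[^A-Za-z0-9\s.,;:!?'\"()-]")
WS_RE = re.compile(r"\s+")


def snippets(text, keyword="description", length=35):
    """Return the first `length` cleaned characters after each <keyword> tag."""
    open_tag = re.compile(r"<%s\b[^>]*>" % re.escape(keyword), re.IGNORECASE)
    close_tag = re.compile(r"</%s\s*>" % re.escape(keyword), re.IGNORECASE)
    results = []
    # Everything before the first opening tag is ignored.
    for chunk in open_tag.split(text)[1:]:
        # Stop at the closing tag if there is one.
        chunk = close_tag.split(chunk, maxsplit=1)[0]
        chunk = unescape(chunk)            # turn &lt;p&gt; into <p>, &amp; into &
        chunk = TAG_RE.sub(" ", chunk)     # remove the tags, keep their inner text
        chunk = JUNK_RE.sub("", chunk)     # remove non-whitelisted characters
        chunk = WS_RE.sub(" ", chunk).strip()
        results.append(chunk[:length])
    return results


def process_file(src="input.txt", dst="output.txt"):
    """Read src, then write one cleaned snippet per line to dst."""
    with open(src, encoding="utf-8", errors="replace") as f:
        found = snippets(f.read())
    with open(dst, "w", encoding="utf-8") as f:
        f.write("\n".join(found) + "\n")
```

Calling process_file("input.txt", "output.txt") would then write one snippet per description. For the sample above, the snippet comes out as "The lawn was filled with Goldman Sa", because the link text counts toward the 35 characters; if text inside links should be skipped as well, the <a ...>...</a> span would have to be removed before the generic tag stripping.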