In Linux, execute the following command to download the "First Monday" article:
wget -O first_monday.html http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3156/2747
Use sed and regular expressions to edit first_monday.html as follows:
Remove empty/blank paragraphs, if any. (The HTML paragraph starting tag is <p> and the ending tag is </p>.) Example:
<p>This is some text in a paragraph.</p>
A paragraph is empty if there is nothing, or only spaces or tabs, between <p> and </p>.
Remove all images. (In HTML, images are defined with the <img> tag.) Example:
<img src="html5.gif" alt="The official HTML5 Icon">
The resulting file should still be a valid HTML file, displayable in a standard web browser. For your answer, copy/paste the commands you used to answer this question. For example, if you used a command similar to
sed -iback -e 's|<p>[[:space:]]*</p>||g' first_monday.html
then you would paste that command as well as any others you used in the answer for this field.
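As a rough illustration of how the image step could be combined with the paragraph expression above, here is a minimal sketch run on a small made-up sample file rather than the downloaded article. The file names and sample content are assumptions for illustration only. It also assumes that each <img> tag sits on a single line and that no attribute value contains a > character.

```shell
# Build a small sample file (hypothetical content, not the real article).
tmp=$(mktemp -d)
cat > "$tmp/sample.html" <<'EOF'
<html><body>
<p>Intro text.</p>
<p><img src="html5.gif" alt="The official HTML5 Icon"></p>
<p>Before <img src="a.png"/> after, then <IMG SRC="b.png"> end.</p>
</body></html>
EOF

# First expression: delete every <img ...> tag, i.e. "<img", any run of
# characters other than ">", then ">". [iI][mM][gG] matches either case.
# Second expression: delete paragraphs holding nothing but whitespace.
# Images go first because removing one can leave an empty <p></p> behind.
sed -e 's|<[iI][mM][gG][^>]*>||g' \
    -e 's|<p>[[:space:]]*</p>||g' \
    "$tmp/sample.html" > "$tmp/edited.html"

cat "$tmp/edited.html"
```

The sketch writes to a second file instead of editing in place, so the input stays available for comparison. Switching to in-place editing with a backup suffix, as in the -iback example above, should work the same way.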
[...] 's|<p>[[:space:]]*</p>||g' first_monday.html but saw no difference – sa044512 Nov 03 '15 at 17:01
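One possible explanation, which is an assumption since the thread does not show the downloaded file: sed works on one line at a time, so an empty paragraph whose <p> and </p> sit on different lines never matches. The sketch below uses a made-up sample file and the GNU sed -z option, which reads the whole input as a single record when it contains no NUL bytes. This relies on GNU sed, as found on most Linux systems, and may not work with other sed implementations.

```shell
# Hypothetical sample: the middle paragraph is empty but spans three lines.
tmp=$(mktemp -d)
printf '<p>Kept.</p>\n<p>\n\t \n</p>\n<p>Also kept.</p>\n' > "$tmp/multiline.html"

# Line-by-line pass: "<p>" and "</p>" never share a line here, so nothing changes.
sed -e 's|<p>[[:space:]]*</p>||g' "$tmp/multiline.html" > "$tmp/per_line.html"

# GNU sed -z: records are NUL-separated, so this file is one record and
# [[:space:]]* can match across the newlines inside the empty paragraph.
sed -z -e 's|<p>[[:space:]]*</p>||g' "$tmp/multiline.html" > "$tmp/whole_file.html"

cmp -s "$tmp/multiline.html" "$tmp/per_line.html" && echo "per-line pass: unchanged"
cat "$tmp/whole_file.html"
```

Comparing the two outputs with diff on the real file would show whether this case is what happened.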