This is stumping me. I have a string which is a verbose piece of XHTML:
irb(main):012:0> input = <<-END
irb(main):013:0" <p><span class=\"caps\">ICES</span> evaluated the management plan in 2009
and found it to be in accordance with the PA. However, the <span class=\"caps\">SSB</span> index , being based on lengths, excludes the problem connected with age estimation.</p>\n<p><span class=\"caps\">SSB</span>
index is estimated to have decreased by more than 20% between the periods 2010–2012
(average of the three years) and 2013–2014 (average of the two years).</p>\n<p>A candidate
multispecies F<sub><span class=\"caps\">MSY</span></sub> was estimated.</p><pre><code><p>
The management plan, agreed October 2007 and implemented January 2008 was evaluated by
<span class=\"caps\">ICES</span> as to its accordance with the precautionary approach and
reviewed by three independent scientists.</p>\n<p>As the strong 2005 and 2006 year classes
enter the fishery discarding is expected to further increase, justifying the implementation
of measures to improve gear selectivity, such as increases in mesh size
(<span class=\"caps\">ICES</span>, 2009a).</p></code></pre>
irb(main):014:0" END
=> "<p><span class=\"caps\">ICES</span> evaluated the management plan in 2009 and found it to
be in accordance with the PA. However, the <span class=\"caps\">SSB</span> index , being based
on lengths, excludes the problem connected with age estimation.</p>\n<p><span class=\"caps\">SSB
</span> index is estimated to have decreased by more than 20% between the periods 2010–2012
(average of the three years) and 2013–2014 (average of the two years).</p>\n<p>A candidate
multispecies F<sub><span class=\"caps\">MSY</span></sub> was estimated.</p><pre><code><p>The
management plan, agreed October 2007 and implemented January 2008 was evaluated by <span
class=\"caps\">ICES</span> as to its accordance with the precautionary approach and reviewed
by three independent scientists.</p>\n<p>As the strong 2005 and 2006 year classes enter the
fishery discarding is expected to further increase, justifying the implementation of
measures to improve gear selectivity, such as increases in mesh size (<span class=\"caps\">ICES
</span>, 2009a).</p></code></pre>\n"
Now I want to strip out everything inside the <pre><code> tags, but the substitution leaves the string unchanged:
irb(main):015:0> input.gsub(/<pre>.*<\/pre>/,'')
=> "<p><span class=\"caps\">ICES</span> evaluated the management plan in 2009 and found it
to be in accordance with the PA. However, the <span class=\"caps\">SSB</span> index , being
based on lengths, excludes the problem connected with age estimation.</p>\n<p><span
class=\"caps\">SSB</span> index is estimated to have decreased by more than 20% between the
periods 2010–2012 (average of the three years) and 2013–2014 (average of the two years).</p>\n
<p>A candidate multispecies F<sub><span class=\"caps\">MSY</span></sub> was estimated.</p><pre>
<code><p>The management plan, agreed October 2007 and implemented January 2008 was evaluated
by <span class=\"caps\">ICES</span> as to its accordance with the precautionary approach
and reviewed by three independent scientists.</p>\n<p>As the strong 2005 and 2006 year classes
enter the fishery discarding is expected to further increase, justifying the implementation
of measures to improve gear selectivity, such as increases in mesh size (<span class=\"caps\">ICES</span>, 2009a).</p></code></pre>\n"
If I strip out the newlines first, it works:
irb(main):016:0> input.gsub(/\n/,'').gsub(/<pre>.*<\/pre>/,'')
=> "<p><span class=\"caps\">ICES</span> evaluated the management plan in 2009 and found it
to be in accordance with the PA. However, the <span class=\"caps\">SSB</span> index , being
based on lengths, excludes the problem connected with age estimation.</p><p><span
class=\"caps\">SSB</span> index is estimated to have decreased by more than 20% between the
periods 2010–2012 (average of the three years) and 2013–2014 (average of the two years).</p>
<p>A candidate multispecies F<sub><span class=\"caps\">MSY</span></sub> was estimated.</p>"
What am I missing?
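A likely explanation, given that removing the newlines makes the pattern match: in Ruby, `.` does not match a newline unless the regex carries the `/m` modifier, and the <pre> block in the heredoc spans several lines. The sketch below uses a shortened, hypothetical stand-in string (`html`) rather than the full input, and the non-greedy `.*?` is an added assumption to avoid swallowing everything between the first <pre> and the last </pre> if there were more than one block.

```ruby
# Hypothetical, shortened stand-in for the XHTML string in the question.
html = "<p>Intro</p><pre><code><p>\nFirst line\nsecond line</p></code></pre>\n"

# Without /m, `.` stops at every newline, so the pattern cannot span the block.
unchanged = html.gsub(/<pre>.*<\/pre>/, '')

# With /m, `.` also matches newlines; `.*?` keeps the match non-greedy so that
# several separate <pre> blocks would not be removed as one.
stripped = html.gsub(/<pre>.*?<\/pre>/m, '')

puts unchanged == html   # the first substitution changed nothing
puts stripped.inspect    # the <pre> block is gone
```

For anything beyond a quick cleanup, an HTML/XML parser is usually more robust than a regex, but the `/m` modifier should be enough to explain the behaviour seen above.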