I have a very large HTML document containing plenty of paragraphs. UPPER CASE text within paragraphs is used for headings.
How can I find all paragraphs containing UPPER CASE text and apply a style to these paragraphs?
There is also a lot of extra spacing around the text in most paragraphs. Sample of existing headings:
<p> </p>
<p> USU EA EUISMOD HONESTATIS DETERRUISSET.</p>
<p>Qualisque mnesarchum no nam, usu cu fastidii delicata. Eu mei nonumy libris, quas movet vivendo vim at. Prima epicuri conceptam pro ad, in suas nonumes similique duo. Qui mundi essent complectitur eu. Ei laudem veritus democritum vis, te ferri appareat eos. Ceteros pertinacia ea eum, quo integre theophrastus ex, eum et sint omnes detracto. Ea vim brute labore. Vim te esse libris erroribus, ex minimum tacimates dissentiet duo. Ignota iisque in mei, pri sanctus albucius omnesque id. Laoreet docendi theophrastus ei pri, duo wisi tollit decore ea, tempor doctus vivendo sed ad. </p>
<p>Usu ea euismod honestatis deterruisset. Ne quo malis meliore, duo viris liberavisse no, mea an vide mutat quodsi. Vis an vidit debitis, et noster aliquam pri, case iudicabit te sea. Cum sadipscing consectetuer cu, an nominavi consulatu adversarium sea, nam ad dico evertitur voluptaria. Id justo viderer bonorum per, in ius impedit tincidunt, nec et quis scaevola. Cu congue iriure scaevola usu. Ei elit reformidans suscipiantur eos, cum ut doming iracundia. </p>
<p> </p>
<p> CU CONGUE IRIURE SCAEVOLA --
UT DOMING IRACUNDIA. </p>
<p> DICO TEMPOR HABEMUS.</p>
<p>Homero everti ei nam. An liber euripidis vis, pericula persecuti deseruisse ad mea. Dicant offendit sea et, per esse timeam deserunt ut. In pri enim sadipscing, ei movet soleat suavitate vim. Mea et omnesque phaedrum, paulo luptatum concludaturque vim ea. -- LIBER. </p>
I want to apply a style to the UPPER CASE text (headings) inside paragraph tags to make it bold.
The block above should look like this after running the regular expression replace(s) or the UltraEdit macro:
<p> </p>
<p class="bold"> USU EA EUISMOD HONESTATIS DETERRUISSET.</p>
<p>Qualisque mnesarchum no nam, usu cu fastidii delicata. Eu mei nonumy libris, quas movet vivendo vim at. Prima epicuri conceptam pro ad, in suas nonumes similique duo. Qui mundi essent complectitur eu. Ei laudem veritus democritum vis, te ferri appareat eos. Ceteros pertinacia ea eum, quo integre theophrastus ex, eum et sint omnes detracto. Ea vim brute labore. Vim te esse libris erroribus, ex minimum tacimates dissentiet duo. Ignota iisque in mei, pri sanctus albucius omnesque id. Laoreet docendi theophrastus ei pri, duo wisi tollit decore ea, tempor doctus vivendo sed ad. </p>
<p>Usu ea euismod honestatis deterruisset. Ne quo malis meliore, duo viris liberavisse no, mea an vide mutat quodsi. Vis an vidit debitis, et noster aliquam pri, case iudicabit te sea. Cum sadipscing consectetuer cu, an nominavi consulatu adversarium sea, nam ad dico evertitur voluptaria. Id justo viderer bonorum per, in ius impedit tincidunt, nec et quis scaevola. Cu congue iriure scaevola usu. Ei elit reformidans suscipiantur eos, cum ut doming iracundia. </p>
<p> </p>
<p class="bold"> CU CONGUE IRIURE SCAEVOLA --
UT DOMING IRACUNDIA. </p>
<p class="bold"> DICO TEMPOR HABEMUS.</p>
<p>Homero everti ei nam. An liber euripidis vis, pericula persecuti deseruisse ad mea. Dicant offendit sea et, per esse timeam deserunt ut. In pri enim sadipscing, ei movet soleat suavitate vim. Mea et omnesque phaedrum, paulo luptatum concludaturque vim ea. -- LIBER. </p>
As some paragraphs contain mixed upper case and lower case text, the regex needs to be limited to paragraphs containing only UPPER CASE text, without any lower case letters. There can also be line breaks within a paragraph.
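To make the rule concrete, here is a minimal Python sketch of the check described above: a paragraph counts as a heading if its text has at least one upper-case letter and no lower-case letter. It assumes paragraphs are plain `<p>...</p>` elements without attributes or nested tags; the names `is_heading` and `mark_headings` are made up for illustration and are not part of UltraEdit.

```python
import re

# Matches one plain <p>...</p> element; DOTALL lets the body span line breaks.
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL)


def is_heading(text: str) -> bool:
    """True if text has at least one upper-case letter and no lower-case letter."""
    has_upper = any(ch.isupper() for ch in text)
    has_lower = any(ch.islower() for ch in text)
    return has_upper and not has_lower


def mark_headings(html: str) -> str:
    """Add class="bold" to every plain <p> whose text is entirely upper case."""
    def replace(match):
        body = match.group(1)
        if is_heading(body):
            return f'<p class="bold">{body}</p>'
        # Empty, whitespace-only, and mixed-case paragraphs are left untouched.
        return match.group(0)

    return PARAGRAPH.sub(replace, html)


sample = """<p> </p>
<p> CU CONGUE IRIURE SCAEVOLA --
UT DOMING IRACUNDIA. </p>
<p>Mea et omnesque phaedrum. -- LIBER. </p>
<p> ЗАГОЛОВОК ГЛАВЫ.</p>"""

print(mark_headings(sample))
```

Because `str.isupper()` and `str.islower()` are Unicode-aware, the same check covers Cyrillic headings. A paragraph with an inline tag such as `<b>` would be skipped, since the lower-case tag name counts as lower-case text.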
How can I accomplish this using a macro or code in UltraEdit for Linux? (Or the Windows version, as the regexes are the same anyway.)
I want to apply a class to the paragraphs (instead of making them headers H1, H2, etc.) because ebook readers (Kindle, etc.) may display headers in unpredictable ways. The document encoding is UTF-8, with a Cyrillic charset.
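Since the text is Cyrillic, the letter classes matter. As a hedged illustration, this Python sketch does the replacement with a single pattern that spells out the letter ranges instead of relying on a POSIX class. The ranges (basic Latin plus Russian Cyrillic with Ё/ё) and the name `mark_headings_regex` are assumptions for this example; other alphabets would need further ranges. Whether a similarly shaped pattern behaves the same in UltraEdit's Perl regex engine is untested here.

```python
import re

# Assumed letter ranges: basic Latin plus Russian Cyrillic (including Ё/ё).
UPPER = "A-ZА-ЯЁ"
LOWER = "a-zа-яё"

# <p>, then text containing no "<" and no lower-case letter but at least one
# upper-case letter, then </p>. The negated classes also match line breaks,
# and excluding "<" keeps a match from running into the next paragraph.
HEADING = re.compile(rf"<p>([^<{LOWER}]*[{UPPER}][^<{LOWER}]*)</p>")


def mark_headings_regex(html: str) -> str:
    """Add class="bold" to plain <p> elements that contain only upper-case text."""
    return HEADING.sub(r'<p class="bold">\1</p>', html)


text = (
    "<p> </p>\n"
    "<p> DICO TEMPOR HABEMUS.</p>\n"
    "<p>Homero everti. -- LIBER. </p>\n"
    "<p> ПРИМЕР\nЗАГОЛОВКА. </p>"
)
print(mark_headings_regex(text))
```

Requiring one upper-case letter in the middle of the pattern is what keeps empty `<p> </p>` paragraphs from being matched.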
\W*?[[:upper:]][^[:lower:]]+?
)` and `\W*?[[:upper:]][[:upper:]\W]*?
)` and found that they match paragraphs with mixed text and skip those in all UPPER CASE. Same thing with the `\W*?\u[\u\W]*?
)` regex. Possibly due to a different Perl regular expression library being used in the UE Linux versions? – fxgreen Jul 29 '16 at 13:44