I'm trying to run a string replace on paragraphs of text in order to highlight substrings.

Those paragraphs, however, may include nested inline forms which may also contain the string to replace. Replacing the string there obviously breaks the form.

This is why I'm looking for a way in PHP to limit string replacement to all occurrences in a tag (p) which are NOT inside a nested tag (form).

Example markup, for illustration, would be something like this:

<p>A hedgehog is any of the spiny mammals of the subfamily Erinaceinae and the order Erinaceomorpha. There are 17 <form  style="display: inline" id="lnk_123" action="hedgehog_species.html">
<input type="hidden" name="id" value="1" />
<input type="hidden" name="gps_src_lbl" value="hedgehog" />
<button type="submit" name="submit">species</button></form> of hedgehog in five genera, found through parts of Europe, Asia, Africa, and New Zealand.</p>

If I did a string replace on "hedgehog", it would break the form, because it would also replace part of the action attribute and the value of the hidden input field.
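
To make the failure concrete, here is a minimal sketch of the naive approach. The markup is a shortened copy of the example above, and the `hl` class name is only an assumption for illustration.

```php
<?php
// Shortened copy of the example paragraph (assumed for illustration).
$html = '<p>There are 17 <form style="display: inline" id="lnk_123" action="hedgehog_species.html">'
      . '<input type="hidden" name="gps_src_lbl" value="hedgehog" />'
      . '<button type="submit" name="submit">species</button></form> of hedgehog in five genera.</p>';

// Naive replacement: wraps every occurrence, including those inside attributes.
// The "hl" class is a hypothetical highlight class, not something from my real code.
$broken = str_replace('hedgehog', '<span class="hl">hedgehog</span>', $html);

echo $broken, PHP_EOL;
// The action attribute now contains markup:
// action="<span class="hl">hedgehog</span>_species.html"
```

All three occurrences get wrapped, although only the last one (in the visible paragraph text) should be.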

Thanks in advance for your help

Argoron
  • You probably want to use some regular expressions. Maybe someone has one for you here but you can try it yourself first. I'm not a big hero in regex. – Bas Slagter Nov 23 '11 at 10:20
  • Some code example would help to understand your problem. – Eugene Nov 23 '11 at 10:24
  • @Baszz why, why, why do you suggest a regex? – CodeCaster Nov 23 '11 at 10:35
  • possible duplicate of [find and replace keywords by hyperlinks in an html fragment, via php dom](http://stackoverflow.com/questions/3151064/find-and-replace-keywords-by-hyperlinks-in-an-html-fragment-via-php-dom) – Gordon Nov 23 '11 at 14:30
  • @Argoron your example HTML and your description are somewhat ambiguous. I understood you don't want to replace in `hedgehog` when you just want to replace in *any* p elements. Note that [DOM operates on Nodes](http://stackoverflow.com/questions/4979836/noob-question-about-domdocument-in-php/4983721#4983721), not strings. I have removed my answer and linked a possible duplicate. In case you have trouble getting it to work, have a look at http://codepad.org/mNV9NS7f – Gordon Nov 23 '11 at 14:46
  • @Gordon - I did indeed have a lot of trouble, as I have never used DOM or XPath. Btw you understood correctly, sorry for not having been clear from the start. Every occurrence within a form tag should be ignored by the process. Will check the code you provided now. Thanks !! – Argoron Nov 23 '11 at 14:55
  • @Gordon - sorry but I seem to be getting nowhere with this. I reckon it works fine in the link you gave me, but I get an encoding error message, an "unexpected end tag p in Entity..." and, honestly, not sure what that "<<< HTML" is, if it is important to put it that way and, if so, how I use it for dynamic text. – Argoron Nov 23 '11 at 16:05
  • @Argoron the [`<<<` implies a HEREDOC block](http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.heredoc). You don't need it. The error you get is likely because you are trying to pass a document without a root node. loadXML expects valid XML. If you don't have that, use loadHTML. – Gordon Nov 23 '11 at 16:13
  • @Gordon - did the loadXML to loadHTML change, and replaced the heredoc block. After also having replaced saveXML with saveHTML, I actually got a result, i.e. highlighting, no encoding problems. Still have the following issues: 1) error message: DOMDocument::loadHTML() [domdocument.loadhtml]: Unexpected end tag : p in Entity, line: 1 2) incorrect replacement. To state a real life example: highlighting of "oblig" in "obligation" results in "oblolbigon" (oblig correctly highlighted). 3) This method actually adds html and body tags, doctype and whatnot. Text is just a paragraph, not a whole doc. – Argoron Nov 23 '11 at 16:40
  • @Argoron yes, that's all expected. check `libxml_use_internal_errors` to suppress the errors and search SO for innerHTML and outerHTML with PHP's DOM. – Gordon Nov 23 '11 at 16:45
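
The suggestions in the comments above can be combined into a rough sketch along the following lines. It is not the code from the linked codepad. The function name `highlightOutsideForms`, the wrapper `div` with id `hl-root`, and the `hl` class are all hypothetical. It also assumes that the HTML parser may close the `<p>` as soon as it meets the `<form>` (which would explain the "Unexpected end tag : p" warning), so the XPath filters on "not inside a form" rather than "inside a p".

```php
<?php
// Hypothetical helper: wraps $needle in <span class="hl"> everywhere in the
// fragment's text except inside <form> elements. A sketch, not a tested library.
function highlightOutsideForms(string $fragment, string $needle): string
{
    if ($needle === '') {
        return $fragment; // nothing to highlight
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // The leading declaration hints UTF-8 to the parser; the wrapper div
    // gives the fragment a single, easy-to-find root.
    $doc->loadHTML('<?xml encoding="UTF-8"><div id="hl-root">' . $fragment . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Text nodes below the wrapper that have no <form> ancestor.
    $nodes = $xpath->query('//div[@id="hl-root"]//text()[not(ancestor::form)]');

    foreach (iterator_to_array($nodes) as $textNode) {
        $parts = explode($needle, $textNode->nodeValue);
        if (count($parts) < 2) {
            continue; // needle does not occur in this text node
        }
        // Rebuild the text node as text / span / text / span / ...
        $replacement = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i > 0) {
                $span = $doc->createElement('span');
                $span->setAttribute('class', 'hl');
                $span->appendChild($doc->createTextNode($needle));
                $replacement->appendChild($span);
            }
            if ($part !== '') {
                $replacement->appendChild($doc->createTextNode($part));
            }
        }
        $textNode->parentNode->replaceChild($replacement, $textNode);
    }

    // Serialize only the wrapper's children, so no doctype, html or body tags.
    $root = $xpath->query('//div[@id="hl-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$sample = '<p>A hedgehog is a spiny mammal. There are 17 '
        . '<form action="hedgehog_species.html">'
        . '<input type="hidden" name="gps_src_lbl" value="hedgehog" />'
        . '<button type="submit" name="submit">species</button></form>'
        . ' of hedgehog in five genera.</p>';

echo highlightOutsideForms($sample, 'hedgehog'), PHP_EOL;
```

Because the replacement is built from DOM nodes rather than by string surgery on the serialized markup, attributes such as `action` and `value` are never touched. The returned markup may be restructured by the parser (for example, the form may end up after the paragraph rather than inside it), so it is worth checking the output against the page layout.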

0 Answers