I have an HTML article with annotations that I retrieve with SPARQL queries. Each annotation refers to a piece of text in the document, and I have to highlight that text by wrapping it in a span.

I have already asked how to wrap text in a span, but now I have a more specific problem that I don't know how to solve. The code I wrote is:

var currentText = $("#"+v[4]["element"]+"").text();
var newText = currentText.substring(0, v[5]["start"]) + "<span class=' annotation' >" + currentText.substring(v[5]["start"], v[6]["end"]) + "</span>" + currentText.substring(v[6]["end"], currentText.length);
$("#"+v[4]["element"]+"").html(newText);

Where:

v[4]["element"] is the id of the parent element of the annotation

v[5]["start"] is the position of the first character of the annotation

v[6]["end"] is the position of the last character of the annotation

Note that start and end are offsets into the element's text and do not count HTML tags.

My mistake is that I extract the content with the text() method (so that the offsets match the annotation's position) and write it back with the html() method. As a result, if the parent node has child elements, they are lost and replaced by plain text.

Example, with an annotation on '2003':

<p class="metadata-entry" id="k673f4141ea127b">
    <span class="generated" id="bcf5791f3bcca26">Publication date (<span class="data" id="caa7b9266191929">collection</span>): </span>
    2003
</p>

It becomes:

<p class="metadata-entry" id="k673f4141ea127b">
    Publication date (collection): 
    <span class="annotation">2003</span>
</p>

I think I should work with nodes instead of extracting and rewriting the content, but I don't know how to find the exact insertion point for the annotation while ignoring HTML tags and preserving child elements.

I read about the jQuery .contents() method, but I couldn't figure out how to use it in my code.
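
One possible way to find the insertion point is to walk the text nodes in document order and keep a running character count, so that offsets measured against text() can be mapped onto individual nodes. The sketch below is only an illustration under that assumption: `locateRange` and `wrapRange` are hypothetical names, `end` is treated as exclusive (as in `substring`), and `wrapRange` needs a browser DOM.

```javascript
// Pure helper: given the lengths of the text nodes in document order,
// return the slice of each node covered by the character range [start, end).
function locateRange(lengths, start, end) {
  var pieces = [];
  var offset = 0; // number of characters before the current node
  for (var i = 0; i < lengths.length; i++) {
    var nodeStart = offset;
    var nodeEnd = offset + lengths[i];
    var from = Math.max(start, nodeStart);
    var to = Math.min(end, nodeEnd);
    if (from < to) {
      // Offsets are stored relative to the start of this node.
      pieces.push({ index: i, from: from - nodeStart, to: to - nodeStart });
    }
    offset = nodeEnd;
  }
  return pieces;
}

// Browser-only part (hypothetical helper): wrap every covered slice in a span
// without touching the surrounding elements.
function wrapRange(root, start, end, className) {
  var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT, null, false);
  var nodes = [];
  while (walker.nextNode()) {
    nodes.push(walker.currentNode);
  }
  var lengths = nodes.map(function (n) { return n.nodeValue.length; });

  locateRange(lengths, start, end).forEach(function (piece) {
    var target = nodes[piece.index];
    // splitText(k) keeps the first k characters in the node and
    // returns the remainder as a new sibling text node.
    if (piece.to < target.nodeValue.length) {
      target.splitText(piece.to);
    }
    if (piece.from > 0) {
      target = target.splitText(piece.from);
    }
    var span = document.createElement("span");
    span.className = className || "annotation";
    target.parentNode.insertBefore(span, target);
    span.appendChild(target);
  });
}
```

In a page, a call might look like `wrapRange(document.getElementById(v[4]["element"]), v[5]["start"], v[6]["end"])`. A range that crosses an element boundary ends up as several spans, one per text node, which avoids producing invalid nesting.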

Can anyone help me with this issue? Thank you

EDIT: Added PHP code to extract the body of the page.

function get_doc_body(){
    if (isset ($_GET ["doc_url"])) {

        $doc_url = $_GET ["doc_url"];
        $doc_name = $_GET ["doc_name"];

        $doc = new DOMDocument;
        $mock_doc = new DOMDocument;

        $doc->loadHTML(file_get_contents($doc_url.'/'.$doc_name));
        $doc_body = $doc->getElementsByTagName('body')->item(0);
        foreach ($doc_body->childNodes as $child){
            $mock_doc->appendChild($mock_doc->importNode($child, true));
        }
        $doc_html = $mock_doc->saveHTML();
        $doc_html = str_replace ('src="images','src="'.$doc_url.'/images',$doc_html);

        echo($doc_html);
    }

}
Gio Bact
  • You will have to iterate over all text nodes. For each text node, split it using the words as separators. Iterate over the result and generate a text node if the string is not a word and a span element if it is one. Insert the new nodes before the original text node. Finally, remove the original text node. – ThW Jan 26 '15 at 14:18
  • Sorry, I didn't understand your answer very well. Can you please give me a short snippet applied to the `<p>` element in my example? – Gio Bact Jan 26 '15 at 14:30
  • Not really, that's why this is a comment, not an answer. You need to start differently. Do not match the element nodes and fetch their 'text' but fetch the text nodes directly. The text nodes contain the words you want to highlight and need to be replaced with new nodes. I implemented this in PHP some years ago... – ThW Jan 26 '15 at 14:59
  • OK, I found a function in [this](http://stackoverflow.com/questions/2525368/loop-through-text-nodes-inside-a-div) question and tried using it in this [Fiddle](http://jsfiddle.net/5k0ge28c/9/). The variables are correct, but it never seems to find a text node, so it never changes the content. Can you help me? – Gio Bact Jan 26 '15 at 15:26
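
The approach described in the comments (fetch the text nodes directly, then split each one around the word to highlight) could be sketched roughly as follows. This is an illustration only: `collectTextNodes` and `segmentText` are hypothetical names, and the functions rely on nothing but the standard `nodeType`, `childNodes` and `nodeValue` properties, so they work on real DOM nodes as well as on plain objects of the same shape.

```javascript
var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers

// Recursively gather every text node below `node`, in document order.
function collectTextNodes(node, out) {
  out = out || [];
  if (node.nodeType === TEXT_NODE) {
    out.push(node);
  } else {
    var children = node.childNodes || [];
    for (var i = 0; i < children.length; i++) {
      collectTextNodes(children[i], out);
    }
  }
  return out;
}

// Split a string into consecutive segments, flagging the ones equal to `word`.
// Flagged segments would become span elements, the others plain text nodes.
function segmentText(text, word) {
  if (!word) {
    return [{ text: text, match: false }];
  }
  var segments = [];
  var pos = 0;
  var hit;
  while ((hit = text.indexOf(word, pos)) !== -1) {
    if (hit > pos) {
      segments.push({ text: text.slice(pos, hit), match: false });
    }
    segments.push({ text: word, match: true });
    pos = hit + word.length;
  }
  if (pos < text.length) {
    segments.push({ text: text.slice(pos), match: false });
  }
  return segments;
}
```

In a browser, `collectTextNodes(document.getElementById("k673f4141ea127b"))` would return the text nodes of the example paragraph, including the whitespace-only ones between tags. Each node could then be replaced by the nodes built from `segmentText(node.nodeValue, "2003")`.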

2 Answers

Instead of doing all this, you can use either $(el).append() or $(el).prepend() to insert the <span> tag:

$("#k673f4141ea127b").append('<span class="annotation">2003</span>');

Or, if I understand correctly, you want to wrap the final 2003 in a span.annotation, right? If that's the case, you can do:

$("#k673f4141ea127b").contents().eq(1).wrap('<span class="annotation" />');

Fiddle:

$(document).ready(function() {
  $("#k673f4141ea127b").contents().eq(1).wrap('<span class="annotation" />');
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p class="metadata-entry" id="k673f4141ea127b">
    <span class="generated" id="bcf5791f3bcca26">Publication date (<span class="data" id="caa7b9266191929">collection</span>): </span>
    2003
</p>
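
One caveat about selecting by position: whitespace between tags also produces text nodes, so the index passed to .eq() depends on how the markup is formatted and may not point at the node you expect. A possible alternative, sketched here under the assumption that the target is the only non-blank text node directly inside the paragraph, is to filter the children by node type and content. The helper names `nonBlankTextChildren` and `wrapNonBlankText` are hypothetical.

```javascript
// Return the direct child text nodes that contain more than whitespace.
function nonBlankTextChildren(parent) {
  var result = [];
  var children = parent.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType === 3 && /\S/.test(child.nodeValue)) {
      result.push(child);
    }
  }
  return result;
}

// jQuery variant of the same idea (requires jQuery and a browser).
function wrapNonBlankText(id) {
  $("#" + id)
    .contents()
    .filter(function () {
      return this.nodeType === 3 && /\S/.test(this.nodeValue);
    })
    .wrap('<span class="annotation" />');
}
```

Note that wrapping the whole text node also puts its leading and trailing whitespace inside the span.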
Praveen Kumar Purushothaman

In the end, my solution is in this Fiddle.

Generalizing:

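var element = document.getElementById(id);
var totalText = element.textContent;
// start and end are offsets into the text content (tags are not counted)
var toFindText = totalText.substring(start, end);
var toReplaceText = "<span class='annotation'>" + toFindText + "</span>";
// With a string pattern, replace() changes only the first match in the HTML
element.innerHTML = element.innerHTML.replace(toFindText, toReplaceText);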

I hope it helps someone else.

Note: This doesn't check whether two or more annotations refer to the same node; I'm working on that right now.
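
As a starting point for that check, annotations on the same element could be compared pairwise to see whether their character ranges overlap, since an overlapping range would straddle the span inserted for an earlier annotation. This is only a sketch: `findOverlaps` is a hypothetical helper, each annotation is assumed to be an object with `start` and `end` offsets, and `end` is treated as exclusive, as in `substring`.

```javascript
// Return the index pairs [i, j] of annotations whose ranges overlap.
function findOverlaps(annotations) {
  var pairs = [];
  for (var i = 0; i < annotations.length; i++) {
    for (var j = i + 1; j < annotations.length; j++) {
      var a = annotations[i];
      var b = annotations[j];
      // Two half-open ranges overlap when each starts before the other ends.
      if (a.start < b.end && b.start < a.end) {
        pairs.push([i, j]);
      }
    }
  }
  return pairs;
}
```

An empty result means the annotations on that element do not overlap. Any reported pair would need separate handling, for example nested spans.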

Gio Bact