14

When selecting a block of text (possibly spanning across many DOM nodes), is it possible to extract the selected text and nodes using Javascript?

Imagine this HTML code:

<h1>Hello World</h1><p>Hi <b>there!</b></p>

If the user initiated a mouseDown event starting at "World..." and then a mouseUp event right after "there!", I'm hoping it would return:

Text : { selectedText: "WorldHi there!" },
Nodes: [ 
  { node: "h1", offset: 6, length: 5 }, 
  { node: "p", offset: 0, length: 16 }, 
  { node: "p > b", offset: 0, length: 6 } 
]

I've tried putting the HTML into a textarea but that will only get me the selectedText. I haven't tried the <canvas> element but that may be another option.

If not JavaScript, is there a way this is possible using a Firefox extension?

Cheekysoft
Pras

5 Answers

16

You are in for a bumpy ride, but this is quite possible. The main problem is that IE (before version 9) and the W3C standard expose completely different interfaces to selections, so if you want cross-browser functionality you basically have to write the whole thing twice. Also, some basic functionality is missing from both interfaces.

Mozilla developer connection has the story on W3C selections. Microsoft have their system documented on MSDN. I recommend starting at PPK's introduction to ranges.

Here are some basic functions that I believe work:

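// detect the legacy IE (< 9) selection model by feature, not by browser name
var msie = !window.getSelection && !!document.selection;

// selection objects will differ between browsers
// (not named getSelection: that would shadow window.getSelection and recurse)
function getSelectionObject () {
  return ( msie )
    ? document.selection
    : window.getSelection();
}

// range objects will differ between browsers
// (getRangeAt( 0 ) throws if nothing is selected)
function getRange () {
  return ( msie )
      ? getSelectionObject().createRange()
      : getSelectionObject().getRangeAt( 0 );
}

// abstract getting a parent container from a range
// (IE always returns an element; commonAncestorContainer may be a text node)
function parentContainer ( range ) {
  return ( msie )
      ? range.parentElement()
      : range.commonAncestorContainer;
}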
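As the comments on this answer point out, the browser check can be replaced by detecting the objects and methods directly. A minimal sketch of that idea, assuming only the standard `window.getSelection()` / `getRangeAt()` calls and the legacy `document.selection.createRange()` call already used above. The name `getSelectionRange` and its optional `win` / `doc` parameters are hypothetical; the parameters exist only so the logic can be exercised with stand-in objects outside a browser.

```javascript
// Returns a standard Range, a legacy IE TextRange, or null when nothing is selected.
// win and doc default to the browser globals (hypothetical parameters for testing).
function getSelectionRange(win, doc) {
  win = win || window;
  doc = doc || document;

  // Standard API: IE 9+ and all other major browsers
  if (typeof win.getSelection === "function") {
    var sel = win.getSelection();
    // getRangeAt(0) throws when rangeCount is 0, so check first
    return (sel && sel.rangeCount > 0) ? sel.getRangeAt(0) : null;
  }

  // Legacy IE (< 9) API; type is "None" when there is no selection
  if (doc.selection && doc.selection.type !== "None") {
    return doc.selection.createRange();
  }

  return null;
}
```

In a page this would be called as `getSelectionRange()`. The caller still has to cope with two different kinds of range object, so this removes the sniffing but not the need to write most things twice.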
Borgar
  • what do you pass to parentContainer as r? I didn't get it working (parentContainer method) – Khaled Al Hourani Jun 10 '09 at 05:38
  • Whoops. Not very clear but that should be a range, I have fixed the variable name. I was thinking it could be used like this: var container = parentContainer( getRange() ); Which is not to say that it will work 100%. The code is intended as an example of the type of work that is necessary to do this and may be found wanting. You'll want to understand the APIs you are dealing with (see links). – Borgar Jun 11 '09 at 00:48
  • `parentContainer()` is unhelpful: the two branches are not guaranteed to return the same thing because the `parentElement()` method of IE's `TextRange` will always return an element while `commonAncestorContainer` could be a text node. Also, there's no need to have any browser sniffing (as implied by use of `msie`): you can easily detect the objects and methods you need. – Tim Down Apr 09 '12 at 16:15
  • This code could cause stack overflow (no pun intended). Take a look at your `getSelection` function; `window.getSelection` and `getSelection` are the same thing, so by overriding the builtin, you cannot access the builtin `getSelection` anymore, rather, your `getSelection` calls itself. – Sapphire_Brick May 24 '20 at 19:56
9

My Rangy library will get you part of the way there by unifying the different APIs in IE < 9 and all other major browsers, and by providing a getNodes() function on its Range objects:

function getSelectedNodes() {
    var selectedNodes = [];
    var sel = rangy.getSelection();
    for (var i = 0; i < sel.rangeCount; ++i) {
        selectedNodes = selectedNodes.concat( sel.getRangeAt(i).getNodes() );
    }
    return selectedNodes;
}

Getting the selected text is pretty easy in all browsers. In Rangy it's just

var selectedText = rangy.getSelection().toString();

Without Rangy:

function getSelectedText() {
    var sel, text = "";
    if (window.getSelection) {
        text = "" + window.getSelection();
    } else if ( (sel = document.selection) && sel.type == "Text") {
        text = sel.createRange().text;
    }
    return text;
}

As for the character offsets, you can do something like this for any node (called node in the code below) in the selection. Note this does not necessarily represent the visible text in the document because it takes no account of collapsed spaces, text hidden via CSS, text positioned outside the normal document flow via CSS, line breaks implied by <br> and block elements, plus other subtleties.

var sel = rangy.getSelection();
var selRange = sel.getRangeAt(0);
var rangePrecedingNode = rangy.createRange();
rangePrecedingNode.setStart(selRange.startContainer, selRange.startOffset);
rangePrecedingNode.setEndBefore(node);
var startIndex = rangePrecedingNode.toString().length;
rangePrecedingNode.setEndAfter(node);
var endIndex = rangePrecedingNode.toString().length;
alert(startIndex + ", " + endIndex);
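For the per-node offset and length the question asks for, a sketch without Rangy might look like the following. It assumes a browser with standard `Range` support (`document.createTreeWalker` and `Range.intersectsNode`); the function names `textNodesInRange` and `describeSelectedTextNodes` are hypothetical. It reports raw text node data, so the same caveats about collapsed spaces and hidden text apply.

```javascript
// Browser-only: collect the text nodes touched by a Range, in document order.
function textNodesInRange(range) {
  var root = range.commonAncestorContainer;
  if (root.nodeType === 3) { // 3 === Node.TEXT_NODE: selection lies inside one text node
    return [root];
  }
  var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var nodes = [];
  var node;
  while ((node = walker.nextNode())) {
    if (range.intersectsNode(node)) {
      nodes.push(node);
    }
  }
  return nodes;
}

// For each text node, report which slice of its data lies inside the range.
// Only the boundary text nodes are partially selected; the rest are taken whole.
function describeSelectedTextNodes(range, textNodes) {
  var parts = [];
  for (var i = 0; i < textNodes.length; i++) {
    var node = textNodes[i];
    var start = (node === range.startContainer) ? range.startOffset : 0;
    var end = (node === range.endContainer) ? range.endOffset : node.data.length;
    if (end > start) { // skip nodes the range only touches at a boundary
      parts.push({
        node: node,
        offset: start,
        length: end - start,
        text: node.data.slice(start, end)
      });
    }
  }
  return parts;
}
```

In a page, with a non-empty selection: `var range = window.getSelection().getRangeAt(0); var parts = describeSelectedTextNodes(range, textNodesInRange(range));`. The offsets here are relative to each text node, not to its parent element as in the question's example.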
Tim Down
  • Hello , I am looking at your work. It is really by far the best. I was still having some trouble to get all the selected text though as text comma separated. Can I have some help? – Sara Kat Sep 20 '19 at 14:40
  • Tim, can you give me some hint about restore selection on re-rendered contenteditable div? I have asked a question about it here https://stackoverflow.com/questions/61011651/cant-restore-selection-with-window-getselection-and-range-after-re-rendering . Thank you! – webprogrammer Apr 03 '20 at 12:22
4

This returns the selected nodes, as I understand the question. Suppose I have a lot of nodes:

<p> ... </p><p> ... </p><p> ... </p><p> ... </p><p> ... </p>...
<p> ... </p><p> ... </p><p> ... </p><p> ... </p><p> ... </p>

If I select only a few of them, I want only those nodes to be in the list.

function getSelectedNodes() {
  // from https://developer.mozilla.org/en-US/docs/Web/API/Selection
  var selection = window.getSelection();
  if (selection.isCollapsed) {
    return [];
  }
  var node1 = selection.anchorNode;
  var node2 = selection.focusNode;
  var selectionAncestor = get_common_ancestor(node1, node2);
  if (selectionAncestor == null) {
    return [];
  }
  return getNodesBetween(selectionAncestor, node1, node2);
}

function get_common_ancestor(a, b)
{
    // from http://stackoverflow.com/questions/3960843/how-to-find-the-nearest-common-ancestors-of-two-or-more-nodes
    // declare with var so these do not leak as implicit globals
    var $parentsa = $(a).parents();
    var $parentsb = $(b).parents();

    var found = null;

    $parentsa.each(function() {
        var thisa = this;

        $parentsb.each(function() {
            if (thisa == this)
            {
                found = this;
                return false;
            }
        });

        if (found) return false;
    });

    return found;
}

function isDescendant(parent, child) {
     // from http://stackoverflow.com/questions/2234979/how-to-check-in-javascript-if-one-element-is-a-child-of-another
     var node = child;
     while (node != null) {
         if (node == parent) {
             return true;
         }
         node = node.parentNode;
     }
     return false;
}

function getNodesBetween(rootNode, node1, node2) {
  var resultNodes = [];
  var isBetweenNodes = false;
  for (var i = 0; i < rootNode.childNodes.length; i+= 1) {
    if (isDescendant(rootNode.childNodes[i], node1) || isDescendant(rootNode.childNodes[i], node2)) {
      if (resultNodes.length == 0) {
        isBetweenNodes = true;
      } else {
        isBetweenNodes = false;
      }
      resultNodes.push(rootNode.childNodes[i]);
    } else if (resultNodes.length == 0) {
      // still before the first selected child: nothing to collect yet
    } else if (isBetweenNodes) {
      resultNodes.push(rootNode.childNodes[i]);
    } else {
      return resultNodes;
    }
  }
  if (resultNodes.length == 0) {
    return [rootNode];
  } else if (isDescendant(resultNodes[resultNodes.length - 1], node1) || isDescendant(resultNodes[resultNodes.length - 1], node2)) {
    return resultNodes;
  } else {
    // same child node for both should never happen
    return [resultNodes[0]];
  }
}

The code should be available at: https://github.com/niccokunzmann/spiele-mit-kindern/blob/gh-pages/javascripts/feedback.js

I posted this answer here because I would have liked to find it here.
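The jQuery-based get_common_ancestor above can also be written with plain parentNode walks. This is a sketch rather than a drop-in replacement: the name `commonAncestor` is hypothetical, and unlike `$(a).parents()` it counts the nodes themselves, so it returns `a` when `a` contains `b`. For a selection, the standard `Range` object's `commonAncestorContainer` property gives the same information directly.

```javascript
// Nearest node that is an ancestor-or-self of both a and b, or null if the
// two nodes are not in the same tree. Relies only on the parentNode property.
function commonAncestor(a, b) {
  var chain = [];
  var node;

  // Record a and every ancestor of a
  for (node = a; node; node = node.parentNode) {
    chain.push(node);
  }

  // Walk up from b until we hit something on a's chain
  for (node = b; node; node = node.parentNode) {
    if (chain.indexOf(node) !== -1) {
      return node;
    }
  }

  return null;
}
```

With the question's markup, the anchor text node inside the h1 and the focus text node inside the b element would resolve to whatever element contains both the h1 and the p.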

User
0

All standards-compliant code that works in IE11+.

Text String

window.getSelection().getRangeAt(0).toString()

The node where the user began selecting (the anchor, which comes after the focus node in document order if the text was selected backwards):

window.getSelection().anchorNode

The node where the user stopped selecting (the focus, which comes before the anchor node in document order if the text was selected backwards):

window.getSelection().focusNode

Would you like to know more? Select some text and run the following JavaScript in the console:

console.log(window.getSelection());
console.log(window.getSelection().getRangeAt(0));
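Two details are easy to trip over with the calls above: `getRangeAt(0)` throws when nothing is selected, and anchorNode / focusNode follow the direction of the user's drag rather than document order. A small sketch that guards against both, using only the standard `rangeCount`, `anchorOffset`, `focusOffset` and `compareDocumentPosition` members. The function names are hypothetical, and the direction check assumes neither node contains the other.

```javascript
// Selected text, or "" when there is no range to read.
function getSelectedTextOrEmpty(sel) {
  return sel.rangeCount > 0 ? sel.getRangeAt(0).toString() : "";
}

// True when the focus (where the drag ended) lies before the anchor
// (where it started) in document order.
function isSelectionBackwards(sel) {
  if (!sel.anchorNode || !sel.focusNode) {
    return false; // nothing selected
  }
  if (sel.anchorNode === sel.focusNode) {
    return sel.focusOffset < sel.anchorOffset;
  }
  // Same value as Node.DOCUMENT_POSITION_PRECEDING: the argument precedes the receiver
  var DOCUMENT_POSITION_PRECEDING = 2;
  var position = sel.anchorNode.compareDocumentPosition(sel.focusNode);
  return (position & DOCUMENT_POSITION_PRECEDING) !== 0;
}
```

In the console, both take the live selection: `getSelectedTextOrEmpty(window.getSelection())` and `isSelectionBackwards(window.getSelection())`.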
John
0

There is a much shorter way if you just want the range.

function getRange(){
    return (navigator.appName=="Microsoft Internet Explorer")
        ? document.selection.createRange().parentElement()
        : (getSelection||document.getSelection)().getRangeAt(0).commonAncestorContainer
}
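One way to make both branches return the same kind of thing is to always hand back an element. A sketch under that assumption: the name `selectionContainerElement` is hypothetical, and it distinguishes the two range types by checking for the `parentElement()` method, which legacy IE `TextRange` objects have and standard `Range` objects do not.

```javascript
// Element that contains the whole range, for either kind of range object.
function selectionContainerElement(range) {
  // Legacy IE TextRange: parentElement() already returns an element
  if (typeof range.parentElement === "function") {
    return range.parentElement();
  }
  // Standard Range: commonAncestorContainer may be a text node,
  // in which case its parent is the containing element
  var node = range.commonAncestorContainer;
  return node.nodeType === 1 ? node : node.parentNode; // 1 === Node.ELEMENT_NODE
}
```

If the common ancestor is the document itself, parentNode is null, so callers should be prepared for a null result.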
JamesBond
  • This is not ideal. First, the browser sniff is unhelpful, since IE 9 and later support standard `Selection` and `Range` APIs and you can just as easily detect the features you need directly. Second, the two branches are not guaranteed to return the same thing: the `parentElement()` method of IE's `TextRange` will always return an element while `commonAncestorContainer` could be a text node. Third, the naming is odd: what the function returns is a node, not a Range. – Tim Down Apr 09 '12 at 16:11
  • Shorter is not necessarily better. – Stefan Falk Mar 13 '16 at 20:16