
I have some text content that I am capturing from a third-party source, which sometimes contains emoji represented as image elements. I find each emoji image element and convert it to the Unicode character for that emoji using the following code:

$(this).find('img.emoji').each(function(i){
    // declare with var so emoji does not leak into the global scope
    var emoji = decodeURIComponent($(this).data('textvalue'));
    $(this).replaceWith(emoji);
});
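
As a small illustration of the decoding step on its own: the data-textvalue attribute holds the percent-encoded UTF-8 bytes of the emoji, and decodeURIComponent turns them back into the character. The helper name emojiFromTextValue is hypothetical and only used for this sketch.

```javascript
// Hypothetical helper: turns a percent-encoded data-textvalue into its character.
function emojiFromTextValue(textValue) {
    return decodeURIComponent(textValue);
}

var emoji = emojiFromTextValue('%F0%9F%98%92');
console.log(emoji);                            // the "unamused face" character
console.log(emoji.codePointAt(0) === 0x1F612); // true
console.log(emoji.length);                     // 2, it is stored as a UTF-16 surrogate pair
```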

However, the text immediately preceding each emoji image element contains an extra whitespace character, right before the emoji. See:

'[...] blah blah blah  <img class="emoji" data-textvalue="%F0%9F%98%92">'

but it should be:

'[...] blah blah blah <img class="emoji" data-textvalue="%F0%9F%98%92">'

Because this is coming from a third-party source, I have no control over the original copy. But I would like to remove that extra whitespace character before each emoji image (whether before or after converting it to Unicode doesn't matter, but I suspect it may be easier to do before). How do I accomplish this?

One idea I had was to get the character position of the beginning of the image element using JavaScript's str.indexOf, and then delete the character just before that position. But that would require converting the parent element to a string, and would cause problems if the initial text itself contained the phrase "<img", as unlikely as that would be.

Is there an easy way to do this that I am missing?

GtwoK
  • From your given example can you show the desired result, just so I can understand clearly please – EagerMike Jun 14 '15 at 22:11
  • @jammycoder updated to show a better example. All I'm trying to do is really just delete a single whitespace before each `$('img.emoji')` element. – GtwoK Jun 14 '15 at 22:17
  • A simple regex might work - `string.replace(/[ ]{2,}( –  Jun 14 '15 at 22:28
  • How about removing all double spaces first see question http://stackoverflow.com/questions/3286874/remove-all-multiple-spaces-in-javascript-and-replace-with-single-space then use your function – EagerMike Jun 14 '15 at 22:33

1 Answer


I'd break out of jQuery here and use native JavaScript - it's better for cases where you've got tags splashed about in text like that.

The best way to think about it (and this is how the browser represents it internally) is that each bit of untagged text is wrapped in a special invisible tag, so instead of

<div>I like ice-cream! <img src='ice-cream'></img> it's so yummy!</div>

You've really got

<div>
    <textnode>I like ice-cream! </textnode>
    <img src='ice-cream'></img>
    <textnode> it's so yummy!</textnode>
</div>

JavaScript will let you loop through all these different nodes and trim the text nodes that come just before <img> tags. Something like this should work (the get(0) just gets the native DOM element out of the jQuery object):

var childNodes = $(this).get(0).childNodes;
// Start at 1 instead of 0: the first node has no previous sibling to trim.
for (var i = 1; i < childNodes.length; i++) {
    var node = childNodes[i];
    if ( isNodeAnImg( node ) ) {
        var previousNode = childNodes[i-1];
        if ( isNodeATextNode( previousNode ) ) {
            stripTrailingSpaceFrom( previousNode );
        }
    }
}

function isNodeAnImg(node) {
    // nodeName is upper-case ("IMG") for elements in HTML documents
    return (node.nodeType == Node.ELEMENT_NODE && node.nodeName.toUpperCase() == "IMG");
}

function isNodeATextNode(node) {
    return node.nodeType == Node.TEXT_NODE;
}

function stripTrailingSpaceFrom( node ) {
    var text = node.textContent;
    var lastCharacter = text.charAt( text.length - 1 );
    // only a plain space is removed; other whitespace is left alone
    if ( lastCharacter === ' ' ) {
        node.textContent = text.substring(0, text.length - 1);
    }
}
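
The same idea can be written as one function that restricts itself to img.emoji elements. This is only a sketch: the name stripSpaceBeforeEmoji is made up for illustration, the class check assumes the images carry the emoji class as in the question, and the numeric constants stand in for Node.ELEMENT_NODE and Node.TEXT_NODE so the function can also be tried outside a browser.

```javascript
// Numeric values of Node.ELEMENT_NODE and Node.TEXT_NODE.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: removes one trailing space from every text node
// that sits immediately before an <img class="emoji"> node.
function stripSpaceBeforeEmoji(childNodes) {
    for (var i = 1; i < childNodes.length; i++) {
        var node = childNodes[i];
        var previous = childNodes[i - 1];
        var isEmojiImg = node.nodeType === ELEMENT_NODE &&
            node.nodeName.toUpperCase() === 'IMG' &&
            /(^|\s)emoji(\s|$)/.test(node.className || '');
        if (isEmojiImg && previous.nodeType === TEXT_NODE) {
            // drop a single trailing space, if there is one
            previous.textContent = previous.textContent.replace(/ $/, '');
        }
    }
}

// Plain objects standing in for DOM nodes, just to show the effect.
var sampleNodes = [
    { nodeType: 3, nodeName: '#text', textContent: 'blah blah blah  ' },
    { nodeType: 1, nodeName: 'IMG', className: 'emoji' },
    { nodeType: 3, nodeName: '#text', textContent: ' more text ' },
    { nodeType: 1, nodeName: 'IMG', className: 'photo' }
];
stripSpaceBeforeEmoji(sampleNodes);
```

In the page itself it would presumably be called as stripSpaceBeforeEmoji($(this).get(0).childNodes) before the images are replaced with their Unicode characters, since afterwards there is no img node left to look for.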
Duncan Thacker