I have to match Arabic text and highlight it. With the current script I can highlight matches, or wrap them in a link, only for single words. How can I modify this script so that it also matches multi-word keywords?

text = text.replace(
    /([\u0600-\u06ff]+)([^\u0600-\u06ff]+)?/g,
    replacer
);

Complete Code

(function () {
    // Our keywords. There are lots of ways you can produce
    // this map, here I've just done it literally
    //الولايات المتحدة الأمريكية, الولايات المتحدة, اوباما, وأمريكا, والتفاوض, وإيران, الاتفاق النووي, الخليج العربي, الخليج الفارسي

    // Note: the original listing repeated the "الخليج العربي" key;
    // the duplicate is dropped here (it had no effect).
    var keywords = {
        "الخليج العربي": true,
        "الاتفاق النووي": true,
        "وإيران": true,
        "والتفاوض": true,
        "وأمريكا": true,
        "اوباما": true,
        "الولايات المتحدة": true,
        "الولايات المتحدة الأمريكية": true
    };

    // Loop through all our paragraphs (okay, so we only have two)
    $("p").each(function () {
        var $this, text;

        // We'll use jQuery on `this` more than once,
        // so grab the wrapper
        $this = $(this);

        // Get the text of the paragraph
        // Note that this strips off HTML tags, a
        // real-world solution might need to loop
        // through the text nodes rather than act
        // on the full text all at once
        text = $this.text();

        // Do the replacements
        // These character classes just use the primary
        // Arabic range of U+0600 to U+06FF, you may
        // need to add others.
        text = text.replace(
            /([\u0600-\u06ff]+)([^\u0600-\u06ff]+)?/g,
            replacer
        );

        // Update the paragraph
        $this.html(text);
    });

    // Our replacer. We define it separately rather than
    // inline because we use it more than once      
    function replacer(m, c0, c1) {
        // Is the word in our keywords map?
        if (keywords[c0]) {
            // Yes, wrap it
            c0 = '<a class="red" href="#">' + c0 + '</a>';
        }
        // c1 is undefined when nothing follows the last word,
        // so fall back to an empty string
        return c0 + (c1 || "");
    }
})();

Fiddle example: http://jsfiddle.net/u3k01bfw/1/

Original source: "Text Matching not working for Arabic issue may be due to regex for arabic"

I simply want to match the keywords and wrap them in HTML, which can be <span></span> or <a href="#"></a>, either to highlight the matched keywords or to convert them into links.

Update: I have also tried other plugins such as Highlight, but that breaks for Arabic too. One user recommended the Highlight plugin as a solution, but it breaks as described in this question I raised last week: https://stackoverflow.com/questions/29533793/highlighting-of-text-breaks-when-for-either-english-or-arabic.

I also have other issues with the approach I have taken:

  1. It works properly if the Arabic words are wrapped in " or separated by , and sometimes the last word doesn't match if I remove \s from the regex.
  2. There may be other conditions where it breaks. So far, whenever I fix one issue, something else breaks.

I would appreciate help with this. I simply want to match exact Arabic keywords, using any plugin that works properly. So far I have tried a few options, but each has one issue or another.

Learning

1 Answer

The problem with your approach isn't the regex. You match each word individually and then check whether that word is a keyword. If keywords contains an entry that isn't a single word, such as a phrase with a space in it, you will never match it.
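To see why, here is a minimal sketch that runs the question's pattern over one two-word keyword and records what the replacer would receive. The names `wordPattern`, `sample` and `seen` are only for illustration.

```javascript
// The word-by-word pattern from the question.
var wordPattern = /([\u0600-\u06ff]+)([^\u0600-\u06ff]+)?/g;

// A single keyword made of two words separated by a space.
var sample = "الاتفاق النووي";

// Collect what the replacer would receive as c0 on each call.
var seen = [];
sample.replace(wordPattern, function (m, c0) {
    seen.push(c0);
    return m;
});

// The phrase arrives as two separate words, so a lookup such as
// keywords["الاتفاق النووي"] can never happen.
console.log(seen.length);          // 2
console.log(seen.indexOf(sample)); // -1
```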

One option is to change the regex pattern to be based on your keywords.

For example:

var keywords = [ "الخليج العربي","الاتفاق النووي","الخليج العربي",
                 "وإيران","والتفاوض","وأمريكا","اوباما",
                 "الولايات المتحدة الأمريكية","الولايات المتحدة"];
var keywordRegexp = new RegExp(keywords.join("|"), 'g');

and then:

text = text.replace(keywordRegexp, '<a class="red" href="#">$&</a>');

or:

function replacer(g0) {
    return '<a class="red" href="#">' + g0 + '</a>';
}
text = text.replace(keywordRegexp, replacer);

Some notes on that:

  • I've changed keywords to an array for convenience.
  • If your keywords may contain RegExp metacharacters, you may want to escape them. Maybe not.
  • You may need to reorder keywords so that longer words come first. Specifically, "hello world" should be before "hello".
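The last two notes can be handled when the pattern is built. This is a rough sketch rather than part of the fiddle: `escapeRegExp` and `buildKeywordRegExp` are hypothetical helper names, and the English sample strings are only there to make the effect easy to read.

```javascript
// Hypothetical helper: escape RegExp metacharacters so that each
// keyword is matched literally.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build one alternation from the keywords,
// longest first, so "hello world" is tried before "hello".
function buildKeywordRegExp(keywords) {
    var sorted = keywords.slice().sort(function (a, b) {
        return b.length - a.length;
    });
    return new RegExp(sorted.map(escapeRegExp).join("|"), "g");
}

var re = buildKeywordRegExp(["hello", "hello world", "a.b"]);
var out = "hello world, a.b, axb".replace(re, "[$&]");
console.log(out); // [hello world], [a.b], axb
```

Without the escaping, "a.b" would also match "axb"; without the sort, "hello" would win and " world" would be left outside the wrapper.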

Working example: http://jsfiddle.net/u3k01bfw/5/
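If a keyword should only match as a whole word and not inside a longer Arabic word, one possible approach is to require a non-Arabic character (or the start of the text) before it and forbid an Arabic character after it. The sketch below is an assumption-laden illustration, not what the fiddle does: `wrapWholeKeywords` is a hypothetical name, and a "word" is defined only in terms of the U+0600 to U+06FF range from the question, so other scripts are not treated as word characters.

```javascript
// Same primary Arabic range as in the question; you may need to add others.
var ARABIC = "\\u0600-\\u06ff";

// Hypothetical helper: wrap keywords only where they are not glued
// to another Arabic letter on either side.
function wrapWholeKeywords(text, keywords) {
    var alternatives = keywords
        .slice()
        .sort(function (a, b) { return b.length - a.length; })
        .map(function (k) { return k.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); })
        .join("|");
    var re = new RegExp(
        "(^|[^" + ARABIC + "])(" + alternatives + ")(?![" + ARABIC + "])",
        "g"
    );
    // Put back the character matched before the keyword, then wrap the keyword.
    return text.replace(re, function (m, before, keyword) {
        return before + '<a class="red" href="#">' + keyword + "</a>";
    });
}

var kw = "اوباما";
// The second word is the keyword with the letter waw (U+0648) prefixed.
var result = wrapWholeKeywords(kw + " \u0648" + kw, [kw]);
console.log(result); // only the first, standalone occurrence is wrapped
```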


I would also mention the jQuery Highlight Plugin, which makes short work of this problem:

$("p").highlight(keywords);

Working example: http://jsfiddle.net/u3k01bfw/6/

Kobi
  • I tried the Highlight plugin as well; the problem with it was that it matched any combination. I have also posted a question about this: http://stackoverflow.com/questions/29533793/highlighting-of-text-breaks-when-for-either-english-or-arabic. – Learning Apr 12 '15 at 05:07
  • If I add keywords with `` then it breaks http://jsfiddle.net/u3k01bfw/7/ and it usually doesn't match the last word – Learning Apr 12 '15 at 05:12
  • @KnowledgeSeeker - Can you please edit the question to include all of these cases? – Kobi Apr 12 '15 at 05:20