Need a regular expression to find English words in Chinese text

Question

I need a regular expression that finds the English text inside Chinese text and wraps it in an element with a class.

Example input:

<p>当然，你要学习<a href='#' target='_blank'>“<b>Megento</b>”</a></p>

Expected output:

<p>当然，你要学习<a href='#' target='_blank'>“<b><span class="english">Megento</span></b>”</a></p>

You probably mean 'Latin characters' rather than 'English text'. — Joe, Sep 05 '13 at 08:45
Your question is too confusing as written. Can you elaborate on what you really want? — MarsOne, Sep 05 '13 at 08:46
It depends on what type of encoding you use. Can you append it? — Drazzi, Sep 05 '13 at 08:56
I ended up with the regular expression `/(<[/\w :/{}%"'=),;.\-]+>)|([\w :/{}%"'=(),.&]+)/g`; it has worked for me so far. — Cader, Sep 11 '13 at 04:59

score 0 · Answer 1 · edited May 23 '17 at 11:56

.NET regular expressions can match by Unicode category or Unicode block using `\p{}`. For example, the regex `\p{IsBasicLatin}` will match x, but not Ǝ (U+018E: Latin Capital Letter Reversed E). Note that the Basic Latin block (U+0000 to U+007F) also contains digits, punctuation and whitespace, not only letters.
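To illustrate what a Unicode block test amounts to, here is a minimal sketch in Python rather than .NET. Python's built-in `re` module does not support `\p{IsBasicLatin}`, so the block is spelled out as an explicit code point range; the names `BASIC_LATIN` and `in_basic_latin` are made up for this example.

```python
import re

# Basic Latin is the block U+0000..U+007F. It is written as an explicit range
# here because Python's re module has no equivalent of .NET's \p{IsBasicLatin}.
BASIC_LATIN = re.compile(r"[\u0000-\u007F]")


def in_basic_latin(ch):
    """Return True if the single character ch lies in the Basic Latin block."""
    return BASIC_LATIN.fullmatch(ch) is not None


print(in_basic_latin("x"))       # plain ASCII letter
print(in_basic_latin("\u018e"))  # Ǝ, Latin Capital Letter Reversed E
print(in_basic_latin("学"))      # CJK ideograph
```

Because the block also covers digits and punctuation, a narrower class such as `[A-Za-z]` is probably closer to what the question calls "English text".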

Using this to match the text content of elements is therefore quite possible.

But don't use regex to parse the HTML itself. Use an HTML parser to process the HTML and then the regex to look at the text content.
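The parser-then-regex approach could look roughly like the following sketch. It is written in Python with the standard library's `html.parser` rather than in .NET, and it uses the plain class `[A-Za-z]+` in place of `\p{IsBasicLatin}` (which Python's `re` does not support). The names `EnglishWrapper` and `wrap_english` are hypothetical, and the sketch drops comments and doctype declarations and does not treat `<script>` or `<style>` content specially.

```python
import re
from html.parser import HTMLParser

# Runs of ASCII letters; an assumption standing in for "English text".
LATIN_RUN = re.compile(r"[A-Za-z]+")


class EnglishWrapper(HTMLParser):
    """Re-emits the HTML, wrapping Latin runs found in text nodes only."""

    def __init__(self):
        # Keep entities such as &amp; as they are instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Copy the tag exactly as written, so attributes are never touched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Only text content reaches this method, so the regex never sees markup.
        self.out.append(
            LATIN_RUN.sub(r'<span class="english">\g<0></span>', data)
        )

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def wrap_english(html):
    parser = EnglishWrapper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = "<p>当然，你要学习<a href='#' target='_blank'>“<b>Megento</b>”</a></p>"
print(wrap_english(sample))
```

Each run of letters gets its own span, so a phrase like "Hello World" would produce two spans; widening the pattern to allow spaces between words is one way to change that.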
