
I tried the code below:

var str = "Arc's 弧";
str.replace(/[^a-z\d\s]+/gi, '');

The result shows:

Arc039s

Expected result

Arc's

What's wrong with the code, and is this the right way to remove Chinese characters, including punctuation?

Thanks in advance.

Crazy
  • You sure the actual string wasn’t `"Arc&#039;s 弧"`? Is your input HTML? And why do you want to remove those characters? – Ry- Aug 28 '17 at 04:03
  • @Ryan Yes, I get it from a database and want to share some text via WhatsApp. If the text contains Chinese characters, WhatsApp is unable to send it properly. – Crazy Aug 28 '17 at 04:05
  • Have you tried escaping the Chinese characters instead of removing them? – The KNVB Aug 28 '17 at 04:08
  • Escaping should work. I faced the same problem with another Unicode-based language, and escaping worked for me. – Amogh Aug 28 '17 at 04:22
  • And if you really want to remove non-ASCII characters, https://stackoverflow.com/questions/20856197/remove-non-ascii-character-in-string may help you. – Amogh Aug 28 '17 at 04:24
  • How does escape() work? Is it the same as replace()? I just want to send text via WhatsApp with all Chinese characters trimmed, but the replace() method doesn't seem to work. – Crazy Aug 28 '17 at 06:47

1 Answer


Check this:

var str = "Arc's 弧";
alert(str);
// Your regex: removes everything except letters, digits, and whitespace
alert(str.replace(/[^a-z\d\s]+/gi, ''));
// New regex: removes only non-ASCII characters
alert(str.replace(/[^\x00-\x7F]/g, ''));

str.replace(/[^a-z\d\s]+/gi,'') removes the Chinese character, but it also removes the ', so the new string will be Arcs. str.replace(/[^\x00-\x7F]/g, "") removes only non-ASCII characters, so the new string will be Arc's. In both cases the space that preceded the Chinese character is left at the end of the string.

http://jsfiddle.net/yjcL5/104/
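
The output Arc039s reported in the question is what the first regex produces when the stored value contains the HTML entity &#039; instead of a literal apostrophe: the regex drops &, # and ; but keeps the digits. The following is only a sketch under that assumption. The helper names decodeNumericEntities and stripNonAscii are made up for illustration, and the sample value is a guess at what the database returns. It decodes numeric entities first and then removes the non-ASCII characters.

```javascript
// Hypothetical helpers for illustration; not part of the original answer.

// Decode numeric HTML entities such as &#039; or &#x27; into characters.
// Named entities (&amp;, &quot;, ...) are NOT handled by this sketch.
function decodeNumericEntities(text) {
  return text
    .replace(/&#x([0-9a-f]+);/gi, function (match, hex) {
      return String.fromCodePoint(parseInt(hex, 16));
    })
    .replace(/&#(\d+);/g, function (match, dec) {
      return String.fromCodePoint(parseInt(dec, 10));
    });
}

// Remove every character outside the ASCII range, then trim the
// whitespace left behind at either end of the string.
function stripNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, '').trim();
}

// Assumed form of the value coming from the database.
var stored = 'Arc&#039;s 弧';

// The regex from the question keeps the entity's digits: "Arc039s "
console.log(JSON.stringify(stored.replace(/[^a-z\d\s]+/gi, '')));

// Decoding first, then stripping, gives: "Arc's"
console.log(JSON.stringify(stripNonAscii(decodeNumericEntities(stored))));
```

If the text ends up inside a URL, escaping it with encodeURIComponent() rather than removing characters may be the better fit, as the comments suggest. Whether that applies depends on how the text is shared, which the question does not show.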

Amogh
  • It is not working in my case. Only Arc is displayed; the characters after it are not shown when I share the text to WhatsApp. – Crazy Aug 28 '17 at 05:45
  • @Zidance To help us understand better, create a fiddle with the exact same code that you are using. – Amogh Aug 28 '17 at 09:17
  • I can't create a jsfiddle because the problem occurs when displaying data from my database. If I create sample data with an apostrophe, I don't see any problem. – Crazy Aug 28 '17 at 09:23
  • I am using phpMyAdmin and the collation is latin1_swedish_ci. Is this related? – Crazy Aug 28 '17 at 09:26
  • I can't tell what is happening without actually looking at it, but from what you are saying, make sure the value retrieved from the DB is UTF-8/Unicode and specify charset=utf8 in the MySQL connection string. And yes, the DB collation also plays an important role. I suggest trying `utf8_unicode_ci`. – Amogh Aug 28 '17 at 09:30