Shouldn't the diacritical mark above the "a" be removed by the regex?

 "hǎo".gsub(/\p{Nonspacing_Mark}/, '')
 => "hǎo" 

 "hǎo".gsub(/\p{Mn}/, '')
 => "hǎo" 

Update:

I think I understand it now, based on how it works in Java:

Normalizer.normalize("hǎo", Form.NFD).replaceAll("\\p{Mn}+", "")

I need to normalize the string first, so that the "ǎ" is split into an "a" followed by a separate combining diacritical mark.
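For reference, here is a minimal Ruby sketch of the same idea. It assumes Ruby 2.2 or later, where `String#unicode_normalize` is part of the standard library; the variable names are only for illustration. It shows that the precomposed "ǎ" is a single code point (U+01CE), which is a letter rather than a nonspacing mark, so `\p{Mn}` has nothing to match until the string is decomposed.

```ruby
# "ǎ" written as the single precomposed code point U+01CE.
composed = "h\u01CEo"

# Three code points: h, ǎ, o. None of them is in category Mn.
composed_points = composed.codepoints.map { |c| format("U+%04X", c) }

# The regex finds no nonspacing mark, so the string is unchanged.
unchanged = composed.gsub(/\p{Mn}/, "")

# NFD splits "ǎ" into "a" (U+0061) plus COMBINING CARON (U+030C, category Mn).
decomposed = composed.unicode_normalize(:nfd)
decomposed_points = decomposed.codepoints.map { |c| format("U+%04X", c) }

# Now the combining mark can be matched and removed.
stripped = decomposed.gsub(/\p{Mn}/, "")

puts composed_points.inspect
puts decomposed_points.inspect
puts stripped
```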

Cheng

1 Answer

puts UnicodeUtils.nfkd("ﻺ (hǎo)").gsub(/[\p{Nonspacing_Mark}]/, '')

See How to replace the Unicode gem on Ruby 1.9?
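If adding a gem is not an option, a rough equivalent can be sketched with the standard library's `String#unicode_normalize` (assuming Ruby 2.2 or later). The helper name `strip_marks` is hypothetical. Note that NFKD, unlike NFD, also applies compatibility decompositions, so characters such as the "ﬁ" ligature (U+FB01) are expanded as a side effect.

```ruby
# Hypothetical helper: decompose with the given normalization form,
# then drop every nonspacing mark.
def strip_marks(text, form)
  text.unicode_normalize(form).gsub(/\p{Mn}/, "")
end

# U+FB01 is the "fi" ligature; U+01CE is the precomposed "ǎ".
sample = "\uFB01n h\u01CEo"

canonical     = strip_marks(sample, :nfd)   # ligature is kept as U+FB01
compatibility = strip_marks(sample, :nfkd)  # ligature is expanded to "fi"

puts canonical
puts compatibility
```

Which form to choose depends on whether expanding compatibility characters is acceptable for the text being processed.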

Community
Cheng