The real issue may be more complicated, but for now I'm trying to accomplish something a bit easier. I'm trying to remove the space between two Chinese/Japanese characters while keeping the space between a number and a character. An example is below:
text = "今天特别 热,但是我买了 3 个西瓜。"
The output I want to get is
text = "今天特别热,但是我买了 3 个西瓜。"
I tried a Python script with a regular expression:
import re
text = re.sub(r'\s(?=[^A-z0-9])', '', text)
However, the result is
text = '今天特别热,但是我买了 3个西瓜。'
So I'm struggling with how to keep the space between a character and a number at all times. I also don't want to work around it by adding a space back between "3" and "个" afterwards.
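One direction that might work, as a rough sketch rather than a confirmed solution: remove a space only when the characters on both sides of it are Chinese/Japanese, using a lookbehind together with a lookahead. The helper name remove_cjk_inner_spaces and the Unicode ranges in CJK are my own assumptions (hiragana, katakana, and the common CJK ideograph blocks); they do not cover every CJK block.

```python
import re

# Assumption: "Chinese/Japanese character" means hiragana/katakana (U+3040-U+30FF)
# and the common CJK ideograph blocks (U+3400-U+4DBF, U+4E00-U+9FFF).
CJK = r'\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff'

def remove_cjk_inner_spaces(text):
    # Hypothetical helper: drop whitespace only when BOTH neighbors are CJK characters.
    # The lookbehind checks the character before the space, the lookahead the one after.
    pattern = r'(?<=[%s])\s+(?=[%s])' % (CJK, CJK)
    return re.sub(pattern, '', text)

text = "今天特别 热,但是我买了 3 个西瓜。"
print(remove_cjk_inner_spaces(text))
```

Because the digit "3" is in neither character class, the spaces on both sides of it fail one of the two assertions and should be left alone, while the space between "别" and "热" is removed.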
I'll continue to think about it, but let me know if you have ideas. Thank you so much in advance!