I have a text file that contains the pattern [Chinese character]\nRT Journal, and I want to find this pattern and replace it with [the original Chinese character]\n\nRT Journal. I tried the code below, but [the original Chinese character] becomes the control character \x01:
import re
x = "据\nRT Journal"
print(re.sub('([\u4e00-\u9fff])\nRT','\1\n\nRT',x))
It returns '\x01\n\nRT Journal' rather than '据\n\nRT Journal'. But if I replace the 据 in x with an a, I get what I want. Could you please explain why this happens and how to fix it? Thanks!
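For reference, a minimal sketch of what is probably going on, assuming Python 3. In an ordinary string literal, '\1' is an octal escape that Python turns into the character \x01 before re.sub ever sees it, so no backreference reaches the regex engine. Writing the replacement as a raw string keeps the backslash, and re.sub then expands \1 to the captured group. The variable names fixed and fixed_g are only illustrative.

```python
import re

x = "据\nRT Journal"

# In a normal (non-raw) literal, '\1' is the octal escape for chr(1),
# which is why the output contained '\x01' instead of the captured character.
assert '\1' == '\x01'

# Raw replacement string: re.sub receives a real backslash followed by 1
# and treats it as a backreference to group 1. It also turns \n into a newline.
fixed = re.sub('([\u4e00-\u9fff])\nRT', r'\1\n\nRT', x)
print(repr(fixed))

# \g<1> is an equivalent, less ambiguous way to write the same backreference.
fixed_g = re.sub('([\u4e00-\u9fff])\nRT', r'\g<1>\n\nRT', x)
print(repr(fixed_g))
```

Note that with an ASCII a in place of 据, the character class [\u4e00-\u9fff] does not match, so re.sub returns the input unchanged and the replacement string is never used. That would explain why no \x01 shows up in that case.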