0

I have a text file that containing a pattern [Chinese character]\nRT Journal, and I want to identify this pattern, and substitute it to [The original Chinese character]\n\nRT Journal. I tried the code below but [The original Chinese character] becomes a unicode \x01.

import re
x = "据\nRT Journal"
print(re.sub('([\u4e00-\u9fff])\nRT','\1\n\nRT',x))

It returns '\x01\n\nRT Journal' rather than '据\n\nRT Journal'. But if I replace the in x with an a, I can get what I want. Can you please explain to me a bit why does this happen and how to solve this? Thanks!

Tututuu
  • 105
  • 2

0 Answers0