The input content is a chunk of HTML copied from a WebKit window, like
It's displayed correctly in WebKit using UTF-8.
What I want to do is remove all the <img> tags. I use this one-liner:
perl -i -pe "s/<img.+?>//g"
The input is the rich text I copied to my clipboard, which another program pipes into this one-liner. It is probably something like:
echo "rich html text" | perl -i -pe "s/<img.+?>//g"
Well, it does remove the <img> tags, but all the Unicode characters get corrupted after the substitution.
I am on Windows 7 with the en-US locale. The cmd code page has already been set to UTF-8.
It doesn't work even if I pass the -C option.
Is there a way to keep the code as a one-liner while making it work for Unicode input?