I'm trying to parse some Japanese text, and I can't seem to figure out the output encoding.
This is the output I'm getting:
これは ̾��,����,*,*,*,*,*
本 ̾��,����,*,*,*,*,*
です ̾��,����,*,*,*,*,*
。 ̾��,������³,*,*,*,*,*
EOS
Steps I took:
git clone https://github.com/taku910/mecab
cd mecab/mecab
./configure --enable-utf8-only --with-charset=utf8
make
sudo make install
mecab -o ~/Desktop/output.txt ~/Desktop/input.txt
where input.txt contains "これは本です。".
I'm using macOS 10.15.3.
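One way to narrow this down would be to read the output file as raw bytes and try a few codecs on the surface form and on the feature columns separately. Below is a minimal Python sketch of that check. The helper names (split_line, decode_attempts, report) and the candidate codec list are my own assumptions, not part of MeCab; EUC-JP and Shift_JIS are included only because they are common legacy Japanese encodings. It also assumes MeCab's default output format, where the surface form and the features are separated by a tab.

```python
# Candidate codecs to try; this list is an assumption, extend it as needed.
CANDIDATES = ["utf-8", "euc_jp", "shift_jis"]


def split_line(line: bytes):
    """Split one MeCab output line into (surface, features) at the first tab."""
    surface, _, features = line.partition(b"\t")
    return surface, features


def decode_attempts(raw: bytes) -> dict:
    """Try each candidate codec; store the decoded text, or None if decoding fails."""
    results = {}
    for codec in CANDIDATES:
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            results[codec] = None
    return results


def report(path: str) -> None:
    """Print, for every line of the file, which codecs decode each half cleanly."""
    with open(path, "rb") as f:
        for line in f:
            line = line.rstrip(b"\r\n")
            if not line or line == b"EOS":
                continue
            surface, features = split_line(line)
            print("surface :", decode_attempts(surface))
            print("features:", decode_attempts(features))
```

Calling report() with the path to output.txt would show whether the two halves of each line decode under different codecs. If they do, the input text and the dictionary are probably in different encodings.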