When I run example code from an NLTK book in Python directly in PowerShell, some characters are not printed. I am using Python 3.6.0, so the encoding is UTF-8, as needed. I suspect the problem is that UTF-8 text written to the command line is not displayed correctly because the console uses a different encoding.
I saw a similar post asking about Russian letters, but it was specific to Java and Linux. It gave me the idea to look for the console encoding setting and change it to UTF-8, but I am unable to find that setting.
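For reference, here is a minimal sketch of how the encodings Python itself reports can be inspected from the interpreter. The helper name `describe_encodings` is made up for illustration; the calls it wraps are from the standard library.

```python
import locale
import sys

def describe_encodings():
    """Collect the encodings Python reports for this session (hypothetical helper)."""
    return {
        "stdout": sys.stdout.encoding,                   # encoding used when printing
        "default": sys.getdefaultencoding(),             # str <-> bytes default
        "filesystem": sys.getfilesystemencoding(),       # encoding for file names
        "locale": locale.getpreferredencoding(False),    # locale's preferred encoding
    }

for name, value in describe_encodings().items():
    print(name, "=", value)
```

If `stdout` already reports UTF-8 here, the encoding is probably not the culprit, and the console font becomes the more likely suspect.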
>>> import nltk
>>> nltk.download('cess_esp')
>>> nltk.corpus.cess_esp.words()
['El', 'grupo', 'estatal', 'Electricité_de_France', ...]
>>> nltk.download('indian')
>>> nltk.corpus.indian.words()
['মহিষের', 'সন্তান', ':', 'তোড়া', 'উপজাতি', '৷', ...]
As shown in the code, I try to print two kinds of words: Spanish words and words from the Indian-languages corpus (the sample above is in Bengali script). Only the Spanish output is printed correctly; for the Indian corpus the console shows blank boxes/squares in place of the letters. However, when I copy the 'blank-boxes' output and paste it into the Chrome address bar or into this post, for example, it is displayed correctly.
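Since the pasted text shows up correctly elsewhere, the strings themselves are probably intact. A small sketch to confirm that without relying on the console being able to draw the glyphs: print each character's code point and Unicode name. The helper `describe_chars` is a hypothetical name; `unicodedata.name` is from the standard library.

```python
import unicodedata

def describe_chars(word):
    """Return (code point, Unicode name) pairs for each character (hypothetical helper)."""
    return [("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in word]

# First three characters of the first word in the output shown above,
# written as escapes so this file stays ASCII-only.
sample = "\u09ae\u09b9\u09bf"
for code_point, name in describe_chars(sample):
    print(code_point, name)
```

The output is plain ASCII, so it stays readable even when the glyphs are not, and the names identify which script the console font would need to cover.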
Edit: The suggested possible duplicate (Displaying Unicode in Powershell) deals with the same problem, but it only suggests a font that works for Arabic, Chinese, Japanese, and Russian characters. I tried that font in my case as well, hoping to get lucky. Unfortunately, it didn't work.