0

I am trying to capture what a person says in Arabic and print it to the terminal. When I speak into the mic, any Arabic word I say is printed as ????, with the number of question marks matching the number of letters in what I said. I added print(get_display(arabic_reshaper.reshape("مرحبا"))) to check whether I can print Arabic characters at all, and this is what shows in my terminal:

Listening...

Error: 'charmap' codec can't encode characters in position 0-4: character maps to <undefined>

Say that again...

I have already set my text editor's encoding to UTF-8. This is my code:

import speech_recognition as sr
import arabic_reshaper
import sys
from bidi.algorithm import get_display

def command():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 0.6
        audio = r.listen(source)

        try:
            print(get_display(arabic_reshaper.reshape("مرحبا")))
            ask = r.recognize_google(audio, language='ar-SA')
            reshaped_ask = arabic_reshaper.reshape(ask)
            bidi_text = get_display(reshaped_ask)
            try:
                if sys.stdout.encoding.lower() == 'utf-8':
                    print(bidi_text)
                else:
                    print(bidi_text.encode(sys.stdout.encoding, errors='replace').decode(sys.stdout.encoding))
            except UnicodeEncodeError:
                print(bidi_text.encode(sys.stdout.encoding, errors='replace').decode(sys.stdout.encoding))
        except Exception as e:
            print("Error:", str(e))
            print("Say that again...")
            return ""

        return ask

command()

When I try to run the following code:

 reshaped_text = arabic_reshaper.reshape("مرحبا")
 bidi_text = get_display(reshaped_text)
 print(bidi_text)

I get مرحبا printed from left to right and not right to left.

tripleee
sami
  • Does this happen if you run your program in a shell outside of VS Code? I have a (totally unsubstantiated) hunch that the issue here is not related to VS Code. Or I guess I'm also confused about what the problem really is. Is the problem how you see the Arabic text? Or is the problem the error message you're showing us? ("Error: 'charmap' codec can't encode characters in position 0-4 ...") – starball May 27 '23 at 07:03
  • Trying to use Unicode on Windows is a common beginner FAQ. See e.g. https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console – tripleee May 29 '23 at 06:14
  • The speech recognition parts seem unrelated to your actual problem. Please review the guidance for providing a [mre] – tripleee May 29 '23 at 06:20
  • I'm not sure about this question closure. I'd like to see a [mre] and more details first. – starball May 29 '23 at 07:12
  • (1) most terminals do not have bidi or complex rendering support. (2) `arabic_reshaper` and `bidi.algorithm` are hacks that only work for some languages, e.g. they will work for Arabic and Persian, fail for Kurdish and many other languages. (3) Best to use a solution that uses the HTML/CSS/JS stack. For vscode, you have two options using the Jupyter extensions: use a Jupyter notebook in vscode, or alternatively run a py file in the IPython interactive terminal. Both approaches will correctly render Arabic. – Andj May 30 '23 at 03:18

2 Answers

1

It doesn't work because your terminal doesn't support it. The characters can be printed, but they appear disjointed. To work with Arabic letters, I suggest using python-eel or python-electron. These packages let you use HTML, CSS and JS for the front end of a GUI application, so you can display the Arabic letters.
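
As a side note on the ???? output in the question: it can be reproduced without any terminal, because it comes from the errors='replace' encoding step in the asker's code. The sketch below assumes the console encoding is a legacy code page such as cp1252 (the 'charmap' error hints at one, but the exact code page is an assumption), and `encode_for_console` is a hypothetical helper name used only for illustration.

```python
def encode_for_console(text, encoding="cp1252"):
    """Mimic the fallback branch from the question: characters that the
    target encoding cannot represent are replaced with '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def can_encode(text, encoding="cp1252"):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)  # strict mode raises on unsupported characters
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no Arabic letters, so each of the five letters becomes '?'.
print(encode_for_console("مرحبا"))
print(can_encode("مرحبا"), can_encode("hello"))
```

So the question marks are an encoding problem, separate from whether the terminal can join and reorder Arabic letters once they do reach it.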

tripleee
Moeez Raza
  • Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community May 27 '23 at 12:35
0

VS Code uses the system terminal as its integrated terminal, so how characters are displayed may depend on the system environment language, the VS Code display language, the terminal's language setting, the terminal code page and many other things. You may have to tweak each one until you find the configuration that displays the characters you want.

  • You can also install the Code Runner extension, and then select Run Code to execute the script, so the result will be displayed in the OUTPUT panel.

  • You can also configure the console in launch.json to be externalTerminal, and then use Run --> Run Without Debugging, so that the result will be displayed on the external terminal.
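
If the code page turns out to be the culprit, one thing to try from inside the script is switching Python's own output stream to UTF-8. This is a minimal sketch, assuming Python 3.7 or later (where text streams have a `reconfigure` method); `force_utf8_output` is a hypothetical helper name, and whether the glyphs then render correctly still depends on the terminal and its font.

```python
import sys


def force_utf8_output(stream=None):
    """Switch a text stream (sys.stdout by default) to UTF-8.

    Returns True if the stream was reconfigured, False if the stream
    does not support it (for example, a replaced or wrapped stdout).
    """
    stream = sys.stdout if stream is None else stream
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    # errors="replace" keeps print() from raising on odd characters.
    reconfigure(encoding="utf-8", errors="replace")
    return True


if __name__ == "__main__":
    # Call once at startup, before the first print() of Arabic text.
    force_utf8_output()
```

This only removes the 'charmap' encoding error. It does not add bidirectional or letter-joining support to a terminal that lacks it, so the Output panel or an external terminal may still be needed.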

JialeDu