After I updated PyCharm to version 2021.2, whenever I create a new .py file in PyCharm's terminal with echo, I can't run it because of the following error:

SyntaxError: Non-UTF-8 code starting with '\xff' in file [file path...] on line 1, but no encoding declared;

After looking it up, I'm convinced that PyCharm is adding a mandatory BOM to the created file.

Things I tried:

Going to File -> File Properties -> Remove BOM (It's unable to remove)

Going to File -> File Properties -> File Encoding and changing it to UTF-8 (it gives me the following pop-up): [screenshot of the pop-up]

Can't convert it either.

Going to Help -> Edit Custom VM Options and adding `-Dconsole.encoding=UTF-8` to it

Creating a new Python file by clicking in the IDE works fine.
Creating a new file with echo in a cmd terminal works fine too.

What is causing this? How do I solve it? I didn't have this problem before updating PyCharm.

Nimantha
Laila Campos
  • Are you sure about the problem? The error message tells you that the file is UTF-16LE, not UTF-8 (and do not remove the BOM from a UTF-16 file). So transcode the file to UTF-8 before thinking about the BOM. Note: Python assumes UTF-8 source code by default. If you want UTF-16, you should declare it at the beginning of the file (which you didn't, hence the initial error). – Giacomo Catenazzi Aug 09 '21 at 07:44
  • I don't want UTF-16. I just want to create a new file using echo in PyCharm's terminal. I didn't have to declare anything before. I'd like to know how I can go back to doing that. What should I do to achieve that? – Laila Campos Aug 09 '21 at 13:40
  • Your source file is UTF-16. This is the problem. – Giacomo Catenazzi Aug 09 '21 at 13:51
  • But why is it UTF-16 then? How do I make it so that whenever I create a new file from PyCharm's terminal it's UTF-8 instead? And why can't I convert it with PyCharm? – Laila Campos Aug 09 '21 at 16:57
  • Why do you use the terminal to create .py files? As far as I know, PyCharm just uses one of the operating system's terminals, so different rules apply. In general, use the terminal if you are already used to it or if you really need it. I personally prefer a terminal outside PyCharm. You can create files with `File -> New`. Depending on your terminal and OS, there are different ways to convert/transcode files. And do not try to "understand" why Microsoft chooses to use UTF-16 and other encodings in 2021. – Giacomo Catenazzi Aug 09 '21 at 17:14
  • I can create the files using an outside terminal or just by clicking in the IDE, but I'm really used to creating and manipulating files in PyCharm's terminal and I would like to continue to do so. :/ Whenever I create a file using an outside terminal, it's encoded in UTF-8 and I don't have this problem. I would like to understand why this difference exists, not the reason Windows uses a particular encoding. – Laila Campos Aug 09 '21 at 17:23
  • Check in Settings -> Tools -> Terminal which terminal you are using, and check the settings of that terminal (or set it to your external terminal). Windows (e.g. PowerShell) is known to transcode redirections. Just make sure you are using the expected terminal and that the terminal has the expected settings. – Giacomo Catenazzi Aug 09 '21 at 17:43
  • I checked it and I'm using powershell.exe. Should I change it to cmd? – Laila Campos Aug 10 '21 at 11:29
  • I have ....\AppData\Local\Programs\Git\git-cmd.exe (it uses bash, is similar to normal Unix consoles, and doesn't transcode redirections) [you have this if you installed Git]. Otherwise, try any other console, preferably one that is also used on Unix/Linux (so there are no new surprises, and there is a lot of documentation and help). – Giacomo Catenazzi Aug 10 '21 at 11:48
  • I changed it to cmd.exe and to git-cmd.exe and it worked with both options. Thank you! I'll mark it as solved. – Laila Campos Aug 10 '21 at 12:18

1 Answer

As Giacomo Catenazzi pointed out, changing the terminal in Settings -> Tools -> Terminal from powershell.exe to either cmd.exe or git-cmd.exe worked perfectly.

I can now create files using echo in the terminal again without them being encoded as UTF-16 and without a BOM.
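
For files that were already created from the PowerShell terminal, a small script can convert them. The `\xff` in the error message is the first byte of the UTF-16LE BOM (`FF FE`), so the sketch below checks for a UTF-16 BOM and, if it finds one, rewrites the file as UTF-8 without a BOM. This is only a minimal illustration: the function name `transcode_utf16_to_utf8` is made up for this example, and it assumes the file is small enough to read into memory.

```python
import codecs
from pathlib import Path


def transcode_utf16_to_utf8(path):
    """Rewrite a UTF-16 file (with BOM) as UTF-8 without a BOM.

    Returns True if the file was rewritten, False if it was left untouched.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()

    # The UTF-32LE BOM (FF FE 00 00) also starts with FF FE, so rule it out.
    # (A UTF-16LE file whose first character is NUL would be skipped too.)
    if raw.startswith(codecs.BOM_UTF32_LE):
        return False

    # Only touch files that start with a UTF-16 byte order mark.
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return False

    # The "utf-16" codec uses the BOM to pick the byte order and drops it.
    text = raw.decode("utf-16")

    # Plain "utf-8" writes no BOM; line endings are kept as they were.
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

Calling it on a file created with echo in PowerShell (for example a hypothetical `new_file.py`) should return `True` the first time and `False` on a second call, since the file no longer starts with a BOM.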

Laila Campos
  • If you use cmd, you have just changed the encoding from UTF-16 to the default code page (so windows-1252 or similar). This might work correctly as long as you stick with ASCII, but that's about it; it's not UTF-8. PowerShell can save as UTF-8, and PowerShell Core uses UTF-8 by default. Git-cmd will also work but has other limitations. – Voo Aug 10 '21 at 12:26
  • @Voo: which limitations of git-cmd? It uses much more standardized tools and is more compatible with deployment terminals. PowerShell is not so stable: too many changes (and judging by the number of encoding and compatibility problems here on SO, it seems a mess). – Giacomo Catenazzi Aug 10 '21 at 12:55
  • @Giacomo PowerShell has been the standard server administration tool on Windows for a long time, so it's very stable. The encoding situation is straightforward as well (classic PowerShell and below uses UTF-16LE, unless you explicitly tell it to use a different encoding). And the git-cmd problems? Simply try `echo ähem > test.txt` and see whether the resulting file looks like correct UTF-8 to you (actually I think git-cmd might have the exact same problems as cmd; I was thinking of git-bash, which has its own limitations). – Voo Aug 10 '21 at 13:01
  • If you want valid UTF-8 support, the easiest way is to use PowerShell Core or use a new enough Windows where you can change the codepage to the utf8 codepage (although that's still somewhat experimental and particularly the old shells don't deal very well with it). – Voo Aug 10 '21 at 13:03
  • @Voo: it gives me what I expect (it depends on LANG, but anyone who uses Unicode sets it to `en_US.UTF-8`, which is pretty standard). *"PowerShell", "PowerShell Core", or recent Windows*. Do you see what I mean? And if I need to store the binary values (or just a wrong encoding) of my program's output, I'm sure redirection works, so I can debug. No extra tools mangle the result of a redirection (which would make encoding problems difficult to debug). – Giacomo Catenazzi Aug 10 '21 at 13:09
  • @Giacomo With git-cmd? I doubt it (except if you changed the default code page). But I assume you meant git-bash (soo.. I guess, do you see what I mean?). But that one has its own share of problems when interacting with win32 programs. But yes if you don't want to use the default OS encodings you will have extra work to do when using the default shell.. want to try and get bash to use utf16 by default for redirection? – Voo Aug 10 '21 at 13:48
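
The encoding differences discussed in these comments can be made concrete with a short Python sketch. It encodes the same `ähem` line in the three encodings mentioned above and checks whether each result is valid UTF-8. It does not run any shell. Using `cp1252` as the stand-in for "the default code page" follows the comment above and is an assumption, since the actual code page depends on the system. The helper name `is_valid_utf8` is made up for this example.

```python
# The same line of text, as different redirections might write it.
text = "ähem\r\n"

samples = {
    # UTF-16LE preceded by its byte order mark (FF FE).
    "utf-16 (LE, with BOM)": b"\xff\xfe" + text.encode("utf-16-le"),
    # Assumed stand-in for a legacy Windows code page.
    "cp1252": text.encode("cp1252"),
    "utf-8": text.encode("utf-8"),
}


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for name, data in samples.items():
    print(f"{name:22} {data!r:45} valid UTF-8: {is_valid_utf8(data)}")

# ASCII-only text is byte-for-byte identical in cp1252 and UTF-8,
# which is why a code-page shell appears to work until a non-ASCII
# character shows up.
print("hello".encode("cp1252") == "hello".encode("utf-8"))
```

Only the `utf-8` sample passes the check: the UTF-16 bytes start with `\xff`, and the cp1252 byte for `ä` is not a complete UTF-8 sequence.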