
I am using the current VSCode (1.50.1) on Windows 10, with the PowerShell Integrated Console:

  • =====> PowerShell Integrated Console v2020.6.0 <=====

I am unable to reference a file path that has Hebrew letters in it. Why?

With only English in the path, my code works fine:

$DestinationDirMP3 = 'C:\data\personal\hinative-mp3'

But if I copy the path from Explorer and there are Hebrew letters in it,

$DestinationDirMP3 = 'C:\data\personal\עברית\cardbuilding\audio-files\hinative'

I get the "Unexpected token" error shown below:

Error:

At C:\develop\utils\powershell\aac-to-mp3-converter.ps1:19 char:44
+ ... MP3 = 'C:\data\personal\עברית\cardbuilding\audio-files\hinative'
+                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Unexpected token 'רית\cardbuilding\audio-files\hinative'

What I have tried:

  • When I use single quotes around the string, as shown above, I get the error shown above.
  • When I use double quotes instead, no error is thrown, but the value stored in the variable is garbled (in the same way as in the error message), and downstream code uses the garbled value instead of the intended one.

What is the secret sauce I am missing?

(I should add that the file names that flow in and out of this script are full of Hebrew letters, and that all works fine. So the presence of Hebrew in general is not the issue. The problem seems to be specific to storing Hebrew letters in this variable.)

SOLUTION

Add the following to your settings.json file (to open it from the Command Palette, press Ctrl+Shift+P, type settings and select Preferences: Open Settings (JSON)):

"[powershell]": {
  "files.encoding": "utf8bom"
}

After this is done, restart VSCode. Now copy and paste your code into a NEW .ps1 file in VSCode and save it. This NEW .ps1 will have the BOM (the BOM does not seem to get added on saving existing .ps1 files). In my case, I did these steps, deleted the original .ps1 and copied my newly created .ps1 to the original .ps1 filename in order to commit to a git repo. (Perhaps I could have saved the new file over the original, but I wanted to see side by side with and without the BOM using a hex editor.)

I should further add that when I take the problem .ps1 file (i.e. the file without the BOM) and simply save it under a new name, VSCode does NOT add the BOM. The only workflow I have found that gets VSCode to add the BOM is to open a NEW empty file (Ctrl+N), paste the code into it, and save it.
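As an alternative to the copy-and-paste workflow, the BOM can also be checked for and added from PowerShell itself. The following is only a minimal sketch, not something from the original workflow: the function names Test-Utf8Bom and Add-Utf8Bom are hypothetical, and it assumes the existing file is BOM-less UTF-8. It uses only .NET file APIs and contains no non-ASCII characters, so the sketch itself is safe to save in any encoding.

```powershell
# Hypothetical helper names; both functions rely only on standard .NET file APIs.

function Test-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # .NET resolves relative paths against the process directory, not the
    # PowerShell location, so convert to a full path first.
    $fullPath = Convert-Path -LiteralPath $Path

    # Read the first three bytes and compare them with the UTF-8 BOM (EF BB BF).
    $stream = [System.IO.File]::OpenRead($fullPath)
    try {
        $buffer = New-Object byte[] 3
        $read = $stream.Read($buffer, 0, 3)
    }
    finally {
        $stream.Dispose()
    }
    return ($read -eq 3 -and $buffer[0] -eq 0xEF -and $buffer[1] -eq 0xBB -and $buffer[2] -eq 0xBF)
}

function Add-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    $fullPath = Convert-Path -LiteralPath $Path
    if (Test-Utf8Bom -Path $fullPath) { return }   # already has a BOM; nothing to do

    # Assumption: the file is currently UTF-8 without a BOM.
    $text = [System.IO.File]::ReadAllText($fullPath, [System.Text.UTF8Encoding]::new($false))

    # UTF8Encoding($true) makes WriteAllText emit the BOM.
    [System.IO.File]::WriteAllText($fullPath, $text, [System.Text.UTF8Encoding]::new($true))
}
```

With the script path from the error message above, usage would look like Add-Utf8Bom -Path 'C:\develop\utils\powershell\aac-to-mp3-converter.ps1', after which Test-Utf8Bom on the same path should return True. Rewriting in place avoids the delete-and-copy step, but keep a backup (or rely on git) in case the file was not UTF-8 to begin with.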

More details are here.

Jonesome Reinstate Monica
  • Does this answer your question? [UTF-8 output from PowerShell](https://stackoverflow.com/questions/22349139/utf-8-output-from-powershell) – voiarn Dec 27 '21 at 19:59
  • @voiarn I do not think so? That reference is really long. I do not see (so far) how it allows me to solve this issue (which so far is internal to the script). Also note my enhancements to the OP. Thank you! – Jonesome Reinstate Monica Dec 27 '21 at 20:24
  • See also: [Using UTF-8 Encoding (CHCP 65001) in Command Prompt / Windows Powershell (Windows 10)](https://stackoverflow.com/a/57134096/1701026). So what PowerShell version and console/ISE/terminal are you using? – iRon Dec 27 '21 at 20:33
  • Powershell 5.1 doesn't recognize utf8 no bom encoded scripts. – js2010 Dec 27 '21 at 20:46
  • @js2010 I am on Powershell 2020.6 – Jonesome Reinstate Monica Dec 27 '21 at 20:48
  • @iRon OP enhanced to answer your version question. Thanks! – Jonesome Reinstate Monica Dec 27 '21 at 20:48
  • @iRon That reference is not working so far in powershell terminal in VSCode. (However, the number of paths is so many that it isn't clear if I took the right one.) – Jonesome Reinstate Monica Dec 27 '21 at 20:53
  • Where did you get that version number? – js2010 Dec 27 '21 at 20:54
  • @js2010, it is the version number of the PowerShell VSCode extension - however, the PowerShell Integrated Console is capable of running both Windows PowerShell and PowerShell (Core) - running `$PSVersionTable` shows which edition is running. – mklement0 Dec 27 '21 at 21:00
  • @JonesomeReinstateMonica, are you saying that other _string literals_ in your script also contain Hebrew letters, but that only _one_ of them causes a problem? Note that the ability to process Hebrew characters in _outside input_ to your script is unrelated to character-encoding problems with the script file itself. If you're using Windows PowerShell, your script file must have a UTF-8 _BOM_ - see [this answer](https://stackoverflow.com/a/54790355/45375). – mklement0 Dec 27 '21 at 21:01
  • @mklement0 YES! The issue is that the .ps1 file did not have the BOM (VSCode just didn't save it that way). Followed your steps (though note, I had to save a NEW .ps1 file, resaving the existing .ps1 with the setting did NOT create the BOM). Care to post that as the answer? – Jonesome Reinstate Monica Dec 27 '21 at 21:08
  • @mklement0 Well, while it is a dupe (in that the resolution is the same), the framing of the question is, I believe, quite different. (For example, getting the error I describe in the post does not lead to the "dupe.") So I am not sure that two questions with the same answer are duplicates.... Anyway, thanks a ton! – Jonesome Reinstate Monica Dec 27 '21 at 21:16
  • My pleasure. Yes, your framing is different, but your duplicate won't go away, so future users searching for the error message will still be able to find it - and then just need one additional click. – mklement0 Dec 27 '21 at 21:37
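
To illustrate the points made in the comments, here is a small sketch of how to check which PowerShell edition is running and of what happens when the UTF-8 bytes of the first two Hebrew letters in the path are decoded with a single-byte code page. It assumes the system's ANSI code page is Windows-1252; the actual code page on the asker's machine is not stated. The letters are built from code points so that the sketch itself stays pure ASCII.

```powershell
# 'Desktop' means Windows PowerShell (5.1), 'Core' means PowerShell 6 or later.
$PSVersionTable.PSEdition

# First two letters of the Hebrew folder name (U+05E2, U+05D1), built from code points.
$hebrew = "$([char]0x05E2)$([char]0x05D1)"

# The bytes a BOM-less UTF-8 script file contains for those two letters.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($hebrew)
($utf8Bytes | ForEach-Object { $_.ToString('X2') }) -join ' '   # D7 A2 D7 91

# Decode the same bytes as Windows-1252 (assumed ANSI code page).
$misread = [System.Text.Encoding]::GetEncoding(1252).GetString($utf8Bytes)
$misread.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }
# U+00D7, U+00A2, U+00D7, U+2018
```

The last character, U+2018, is a typographic left single quotation mark, which PowerShell's parser accepts as a single-quote delimiter. That would be consistent with the symptoms in the question: in the single-quoted version the string appears to end after the second Hebrew letter, leaving the rest of the path as an unexpected token, while the double-quoted version parses but holds garbled text.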
