
I need to do some file operations in a folder structure that contains special non-ASCII characters. PowerShell fails when I try to use any of those characters.

For example, this doesn't work:

$TestPath = "C:\Examples\Folder_1ĀČ\"
$ExampleFileName = "Test.txt"

Copy-Item ($PSScriptRoot + "\" + $ExampleFileName)  -Destination ($TestPath) -Force

But this works:

$TestPath = "C:\Examples\Folder_1AC\"
$ExampleFileName = "Test.txt"

Copy-Item ($PSScriptRoot + "\" + $ExampleFileName)  -Destination ($TestPath) -Force

I tried debugging with

Write-Output $TestPath

And result returned in the console was:

C:\Examples\Folder_1Ä€Ä\

Is it possible to use PowerShell with paths containing these characters? How can I do that?

Emi
  • No problem with me. Worked like a charm. Do you get an error message ? – Civette Jul 20 '23 at 08:57
  • It returns `Copy-Item : The filename, directory name, or volume label syntax is incorrect.` If I change the string value and exclude those characters, then it works fine. Since then, I tried debugging with `Write-Output $TestPath` and got the result `C:\Examples\Folder_1Ä€Ä\` I will edit the question, to reflect this. – Emi Jul 20 '23 at 09:25
  • Sorry, I cannot reproduce your issue. All fine here – Civette Jul 20 '23 at 09:53
  • You face a [mojibake](https://en.wikipedia.org/wiki/Mojibake) case: `[System.Text.Encoding]::GetEncoding( 1257).GetString( [System.Text.Encoding]::GetEncoding( 'UTF-8').GetBytes( 'Folder_1ĀČ'))` returns `Folder_1Ä€Ä`. Please [edit] your question to improve your [mcve]. In particular, share `[System.Text.Encoding]::Default`, `[System.Console]::InputEncoding`, `[System.Console]::OutputEncoding`, and `$PSVersionTable.PSVersion`. BTW, `Ā` (U+0100, *Latin Capital Letter A With Macron*) and `Č` (U+010C, *Latin Capital Letter C With Caron*) are both *Latin* (maybe you mean *non-ASCII*?) – JosefZ Jul 20 '23 at 14:53
  • If ASCII is the basic Latin without any letter modifiers, then yes, I meant non-ASCII. As for the rest, I don't really understand, what you mean. If I did, I might have been able to solve this problem on my own. Sorry. – Emi Jul 20 '23 at 15:26
  • "*I don't really understand, what you mean.*". Please [edit] your question to improve your [mcve]. Just share required info… – JosefZ Jul 20 '23 at 20:30

1 Answer


It looks like you have a problem with your code page in PowerShell. Check it and switch to UTF-8.

Have a look at this:
StackOverflow: Changing PowerShell's default output encoding to UTF-8
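
In Windows PowerShell that check and switch might look like this (a sketch; the exact defaults depend on your system locale and PowerShell edition):

```powershell
# Inspect the encodings currently in effect
[System.Text.Encoding]::Default      # system default (the local ANSI code page in Windows PowerShell)
[System.Console]::InputEncoding      # encoding of console input
[System.Console]::OutputEncoding     # encoding of console output
$OutputEncoding                      # encoding used when piping text to native programs

# Switch the console and the pipeline to UTF-8
[System.Console]::InputEncoding  = [System.Text.Encoding]::UTF8
[System.Console]::OutputEncoding = [System.Text.Encoding]::UTF8
$OutputEncoding = [System.Text.UTF8Encoding]::new()
```

These settings only apply to the current session; put them in your profile to make them stick.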


Update

As you write

I found out that -encoding default works for my case

Default is your system code page.
The simplest way to display it is to execute:

chcp

=> Which code page does it report?


I suppose there's a small typo in your question:

You write:        C:\Examples\Folder_1Ä€Ä\
I would expect:   C:\Examples\Folder_1ĀČ\

These are typical character-encoding translation problems that occur when a UTF-8 character's binary encoding is interpreted with a local ANSI code page: Ā => Ä€ and Č => ÄŒ
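
The round trip can be reproduced directly in PowerShell (code page 1257, the Baltic ANSI code page, is an assumption taken from JosefZ's comment; substitute your own ANSI code page):

```powershell
# Encode the intended name as UTF-8: Ā becomes the two bytes C4 80, Č becomes C4 8C
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes('Folder_1ĀČ')

# Re-interpret those UTF-8 bytes with a single-byte ANSI code page
# => each byte becomes a separate character, producing the garbled name
[System.Text.Encoding]::GetEncoding(1257).GetString($utf8Bytes)
```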

Please note, it is of interest:

  • Which code page is your environment using?
  • Which character encoding (code page) was the script file saved with?
  • Which encoding does PowerShell use for reading the script and executing it?

Given your updated info, the logic says:

  1. As you can successfully execute the script with -encoding Default
    => Your script has been stored using your local codepage.

  2. As not using -encoding Default results in "extended" characters:
    => PowerShell assumes UTF-8 as the encoding, and

    • converts the bytes read from the file to proper UTF-8 characters (decoding the characters ĀČ as UTF-8 sequences)
    • but finally the converted characters' binary representation is interpreted using the local ANSI code page.
      The result is Ä€ÄŒ, as the characters ĀČ are 2-byte-encoded in UTF-8.
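
A minimal way to see both readings side by side, assuming Windows PowerShell 5.1 and a test file written in the local ANSI code page:

```powershell
# Write a line containing the special characters using the local ANSI code page
'C:\Examples\Folder_1ĀČ\' | Set-Content -Path .\paths.txt -Encoding Default

# Read it back with matching and non-matching encodings
Get-Content -Path .\paths.txt -Encoding Default   # characters come back intact
Get-Content -Path .\paths.txt -Encoding UTF8      # misread: the ANSI bytes are not valid UTF-8
```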

As a consequence you should take care that all your environments (including your GUI editor) and PowerShell's defaults are set to the same code page.

Regarding this

PowerShell is now cross-platform, via its PowerShell Core edition, whose encoding - sensibly - defaults to BOM-less UTF-8, in line with Unix-like platforms.

(citation from link above)

I would suggest migrating everything to UTF-8, so -encoding Default becomes the same as -encoding UTF8.
But be sure to briefly test your stored file-/directory-names and content, as currently they are all written using your local ANSI code page.
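
Assuming Windows PowerShell 5.1 and scripts currently saved in the local ANSI code page, such a migration could be sketched like this (C:\Scripts is a placeholder; test on copies first, and note that -Encoding UTF8 writes a BOM in 5.1):

```powershell
# Re-encode all .ps1 files in a folder from the local ANSI code page to UTF-8
Get-ChildItem -Path C:\Scripts -Filter *.ps1 |
    ForEach-Object {
        $text = Get-Content -Path $_.FullName -Encoding Default -Raw
        Set-Content -Path $_.FullName -Value $text -Encoding UTF8
    }
```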

In the meantime you have to tell PowerShell, via -encoding Default, not to assume your script is stored as UTF-8.


How do I use this encoding for other functions like Copy-Item?

By using

mycmdlet.ps1 -encoding Default

You tell PowerShell to read everything with your currently used local ANSI code page, so everything handled by the commands will match it.
When something enters or leaves the cmdlet's processing (because it is read or written), the system's code page (local ANSI) is used, so everything should be OK there as well.

dodrg
  • After some trial and error, I found out that `-encoding default` works for my case. (utf8 did not work). But that's only for outputting to a file. How do I use this encoding for other functions like Copy-Item? – Emi Jul 20 '23 at 12:10
  • Answer updated. — By the way: As my test-cmdlet of your code has been saved in `UTF8`, but my Windows-VM defaults to `local ANSI` (as yours), I had the effect exactly the other way round: `-encoding UTF8` made the trick `-encoding Default` showed your issues. Also not advising an encoding produced the issue. – dodrg Jul 20 '23 at 16:56
  • Did this satisfy your question? – dodrg Jul 26 '23 at 15:07