tl;dr:
Use, e.g., Export-Csv -Encoding utf8 ... to save your file with UTF-8 character encoding, which ensures that accented characters such as ö are preserved.
In Windows PowerShell, Export-Csv regrettably defaults to ASCII encoding, which means that any characters outside the US-ASCII range - notably accented characters such as ö - are transliterated to literal ? characters.
That is, such characters are lost, because they cannot be represented in ASCII encoding.
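To see the loss for yourself, here is a minimal sketch that requests ASCII explicitly, so that it behaves the same way in both editions. The sample data, property name and file name are made up for illustration.

```powershell
# Build the sample string from a code point so that the demo does not depend
# on how this script file itself is encoded.
$name = "J$([char]0xF6)rg"                      # "Jörg"
$rows = [pscustomobject]@{ Name = $name }

# Hypothetical output path in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ascii-demo.csv'

# ASCII is requested explicitly; in Windows PowerShell this is also what you
# get when -Encoding is omitted.
$rows | Export-Csv -LiteralPath $path -Encoding ascii -NoTypeInformation

# Read the file back: the ö has been replaced with a literal "?".
$asciiText = Get-Content -LiteralPath $path -Raw
$asciiText

Remove-Item -LiteralPath $path
```

Once the file has been written this way, the original character cannot be recovered from it.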
In PowerShell [Core] v6+, all cmdlets, including Export-Csv, now thankfully default to BOM-less UTF-8 encoding.
As for the behavior when you append to a preexisting CSV file with the -Append switch without specifying -Encoding, see this answer.
Therefore, especially in Windows PowerShell, use the -Encoding parameter to specify the desired character encoding:
-Encoding utf8 is advisable, because it is capable of encoding all Unicode characters.
- In Windows PowerShell, the resulting file will invariably have a BOM.
- In PowerShell [Core] v6+, it will be BOM-less, which is generally better for cross-platform compatibility, but you can alternatively use -Encoding utf8BOM to use a BOM.
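The following sketch exports with -Encoding utf8 and then inspects the first three bytes of the file to check for a BOM. The sample data and file name are illustrative assumptions; the expected BOM result per edition is simply what the points above describe.

```powershell
# Build the sample string from a code point so that the demo does not depend
# on how this script file itself is encoded.
$name = "J$([char]0xF6)rg"                      # "Jörg"
$rows = [pscustomobject]@{ Name = $name }

# Hypothetical output path in the temp folder.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'utf8-demo.csv'

$rows | Export-Csv -LiteralPath $path -Encoding utf8 -NoTypeInformation

# EF BB BF at the start of the file is the UTF-8 BOM.
$bytes  = [System.IO.File]::ReadAllBytes($path)
$hasBom = $bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF
"UTF-8 BOM present: $hasBom"    # per the above: True in Windows PowerShell, False in v6+

# Passing -Encoding on import as well makes the round trip independent of
# the edition's default for BOM-less files.
$roundTrip = Import-Csv -LiteralPath $path -Encoding utf8
$roundTrip.Name

Remove-Item -LiteralPath $path
```

Either way, the ö survives the round trip, which is the point of choosing UTF-8.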
-Encoding Unicode (UTF-16LE) is another option, but it results in larger files (most characters are encoded as 2 bytes). This encoding always results in a BOM.
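As a rough illustration of the size difference, this sketch writes the same made-up row once as UTF-16LE and once as UTF-8, then compares the file sizes and checks for the UTF-16LE BOM (FF FE). The file names are hypothetical.

```powershell
# Build the sample string from a code point so that the demo does not depend
# on how this script file itself is encoded.
$name = "J$([char]0xF6)rg"                      # "Jörg"
$rows = [pscustomobject]@{ Name = $name }

$dir       = [System.IO.Path]::GetTempPath()
$utf16Path = Join-Path $dir 'utf16-demo.csv'
$utf8Path  = Join-Path $dir 'utf8-size-demo.csv'

$rows | Export-Csv -LiteralPath $utf16Path -Encoding Unicode -NoTypeInformation
$rows | Export-Csv -LiteralPath $utf8Path  -Encoding utf8    -NoTypeInformation

# FF FE at the start of the file is the UTF-16LE BOM.
$utf16Bytes           = [System.IO.File]::ReadAllBytes($utf16Path)
$startsWithUtf16LeBom = $utf16Bytes[0] -eq 0xFF -and $utf16Bytes[1] -eq 0xFE

$utf16Size = $utf16Bytes.Length
$utf8Size  = (Get-Item -LiteralPath $utf8Path).Length

"UTF-16LE BOM present: $startsWithUtf16LeBom"
"UTF-16LE size: $utf16Size bytes; UTF-8 size: $utf8Size bytes"

Remove-Item -LiteralPath $utf16Path, $utf8Path
```

For mostly Latin-script data such as this, the UTF-16LE file comes out close to twice the size of the UTF-8 one.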
-Encoding Default (Windows PowerShell) or -Encoding (Get-Culture).TextInfo.ANSICodePage (PowerShell [Core] v6+) on Windows uses your system's active ANSI code page to create a BOM-less file.
- This legacy encoding is best avoided, however, for multiple reasons:
  - Many modern applications assume UTF-8 encoding in the absence of a BOM.
  - Even those that read the file as ANSI-encoded may interpret a file differently if the host system happens to have a different active ANSI code page.
  - Since the active ANSI code page is (for Western cultures) a fixed, single-byte encoding, only 256 characters can be represented, which is only a small subset of all Unicode characters.
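The single-byte limitation can be sketched directly with .NET's encoding classes. Note the assumption: ISO-8859-1 is used here only as a stand-in for "some single-byte legacy encoding", because it is available in both editions without extra setup; your actual ANSI code page may well be a different one (e.g. Windows-1252) and cover a somewhat different set of characters.

```powershell
# Stand-in single-byte encoding (assumption; not necessarily your ANSI code page).
$singleByte = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$utf8       = [System.Text.Encoding]::UTF8

# "ö €", built from code points: ö exists in ISO-8859-1, the euro sign does not.
$text = "$([char]0xF6) $([char]0x20AC)"

# Every character is squeezed into exactly one byte ...
$legacyBytes = $singleByte.GetBytes($text)

# ... so the euro sign cannot survive; it typically comes back as "?".
$legacyRoundTrip = $singleByte.GetString($legacyBytes)

# UTF-8 can represent both characters and round-trips them unchanged.
$utf8RoundTrip = $utf8.GetString($utf8.GetBytes($text))

"single-byte round trip: $legacyRoundTrip"
"UTF-8 round trip:       $utf8RoundTrip"
```

The ö is preserved only because it happens to be among the characters this particular encoding covers.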
Note that when PowerShell reads a file that is BOM-less, including source code, the behavior differs between the two editions:
In Windows PowerShell, Default is assumed, i.e. the system's active ANSI code page.
- Note that in recent versions of Windows 10 it is now possible to make UTF-8 the ANSI code page, but such a system-wide change can have unintended consequences - see this answer.
In PowerShell [Core] v6+, UTF-8 is assumed.
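What such a mismatch looks like can be sketched without touching any files, by decoding BOM-less UTF-8 bytes with a single-byte encoding. As an assumption for the sake of a portable demo, ISO-8859-1 again stands in for the active ANSI code page.

```powershell
# "Köln", built from a code point so that the demo does not depend on how
# this script file itself is encoded.
$text = "K$([char]0xF6)ln"

# The bytes a BOM-less UTF-8 file would contain: ö becomes the two bytes C3 B6.
$utf8Bytes = (New-Object System.Text.UTF8Encoding $false).GetBytes($text)

# Misreading those bytes with a single-byte encoding turns ö into two characters.
$legacy  = [System.Text.Encoding]::GetEncoding('iso-8859-1')
$misread = $legacy.GetString($utf8Bytes)

# Reading them as UTF-8 recovers the original text.
$correct = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)

"misread: $misread"    # 5 characters instead of 4
"correct: $correct"
```

To avoid depending on the edition's default, pass -Encoding explicitly when reading too, e.g. Import-Csv -Encoding utf8 ... or Get-Content -Encoding utf8 ...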