I have a script that works, but since we have coworkers whose names contain ö, ü, or ä, the CSV turns those characters into ? (example: Hörnlima becomes H?rnlima). Because of this it doesn't give me back any SamAccountName, and the list isn't correct anymore. How can I fix that?

Script:

Import-Csv D:\Files\PowerShell\Test\4ME\DisplaynameToSamAccountName\Displaynames.txt | ForEach {
    Get-ADUser -Filter "DisplayName -eq '$($_.DisplayName)'" -Properties Name, SamAccountName |
        Select Name, SamAccountName
} | Export-CSV -Path D:\Files\PowerShell\Test\4ME\DisplaynameToSamAccountName\Accountnames.csv -NoTypeInformation

Any ideas appreciated.

Make sure your CSV is saved with UTF-8 encoding. Then use `Import-Csv -Path .. -Encoding UTF8`. Save with `Export-Csv -Path .. -Encoding UTF8` – Theo Oct 08 '20 at 12:54

1 Answer


tl;dr:

Use, e.g., Export-Csv -Encoding utf8 ... to save your file with UTF-8 character encoding, which ensures that accented characters such as ö are preserved.
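
Applied to the script in the question, that means passing -Encoding on both the import and the export, per Theo's comment above (a sketch that assumes Displaynames.txt is, or can be re-saved as, UTF-8):

# Read the input as UTF-8 and write the output as UTF-8, so ö/ü/ä survive the round trip:
Import-Csv -Path D:\Files\PowerShell\Test\4ME\DisplaynameToSamAccountName\Displaynames.txt -Encoding UTF8 | ForEach-Object {
    Get-ADUser -Filter "DisplayName -eq '$($_.DisplayName)'" -Properties Name, SamAccountName |
        Select-Object Name, SamAccountName
} | Export-Csv -Path D:\Files\PowerShell\Test\4ME\DisplaynameToSamAccountName\Accountnames.csv -NoTypeInformation -Encoding UTF8

Note that Import-Csv's -Encoding must match how Displaynames.txt was actually saved; if that file is ANSI-encoded, re-save it as UTF-8 first.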


In Windows PowerShell, Export-Csv regrettably defaults to ASCII encoding, which means that any characters outside the US-ASCII range - notably accented characters such as ö - are transliterated to literal ? characters.

That is, such characters are lost, because they cannot be represented in ASCII encoding.
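
A minimal way to see this lossy behavior in Windows PowerShell (demo.csv is just a scratch file name):

# Windows PowerShell 5.1: without -Encoding, Export-Csv writes ASCII,
# so the 'ö' below is transliterated to a literal '?'.
[pscustomobject] @{ DisplayName = 'Hörnlima' } | Export-Csv -Path .\demo.csv -NoTypeInformation
Get-Content -Path .\demo.csv   # expected output: "DisplayName", then "H?rnlima"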

In PowerShell [Core] v6+, all cmdlets, including Export-Csv, now thankfully default to BOM-less UTF-8 encoding.

As for the behavior when you append to a preexisting CSV file with the -Append switch without specifying -Encoding, see this answer.
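
The short version: when appending, state the encoding explicitly so it matches the existing file's encoding (a sketch, with $newRows standing in for whatever objects you want to add):

# Hypothetical: append rows to an existing UTF-8 CSV; an explicit -Encoding
# avoids mixing encodings within a single file.
$newRows | Export-Csv -Path .\Accountnames.csv -Append -NoTypeInformation -Encoding utf8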


Therefore, especially in Windows PowerShell, use the -Encoding parameter to specify the desired character encoding:

  • -Encoding utf8 is advisable, because it is capable of encoding all Unicode characters.

    • In Windows PowerShell, the resulting file will invariably have a BOM.
    • In PowerShell [Core] v6+, it will be BOM-less, which is generally better for cross-platform compatibility, but you can alternatively use -Encoding utf8BOM to include a BOM (a quick byte-level check for a BOM is sketched after this list).
  • -Encoding Unicode (UTF-16LE) is another option, but it results in larger files (most characters are encoded as 2 bytes). This encoding always results in a BOM.

  • -Encoding Default (Windows PowerShell) or
    -Encoding (Get-Culture).TextInfo.ANSICodePage (PowerShell [Core] v6+) on Windows uses your system's active ANSI code page to create a BOM-less file.

    • This legacy encoding is best avoided, however, for multiple reasons:
      • Many modern applications assume UTF-8 encoding in the absence of a BOM.

      • Even those that read the file as ANSI-encoded may interpret it differently if the host system happens to have a different active ANSI code page.

      • Since the active ANSI code page is (for Western cultures) a fixed, single-byte encoding, only 256 characters can be represented, which is only a small subset of all Unicode characters.
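
If you're not sure what a given cmdlet wrote, you can inspect the file's first bytes for the 3-byte UTF-8 BOM (0xEF 0xBB 0xBF); a diagnostic sketch, with the path just an example:

# Check for a UTF-8 BOM at the start of the file.
$bytes = [System.IO.File]::ReadAllBytes("$PWD\Accountnames.csv")  # example path
if ($bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF) {
    'UTF-8 BOM present'
} else {
    'no UTF-8 BOM'
}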


Note that when PowerShell reads a file that is BOM-less, including source code, the behavior differs between the two editions:

  • In Windows PowerShell, Default is assumed, i.e. the system's active ANSI code page.

    • Note that in recent versions of Windows 10 it is now possible to make UTF-8 the ANSI code page, but such a system-wide change can have unintended consequences - see this answer.
  • In PowerShell [Core] v6+, UTF-8 is assumed.
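
Therefore, when reading a BOM-less file whose encoding you know, pass -Encoding explicitly so both editions behave the same (the file name here is just an example):

# An explicit input encoding makes reads deterministic across editions:
Import-Csv -Path .\Displaynames.txt -Encoding UTF8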
