Notepad++ reports that the CSV file is ANSI-encoded.

The PowerShell 7 Import-Csv cmdlet has various -Encoding options, but 'Ansi' is not one of them.

How do I get PowerShell to read this CSV without mangling it?

The options for -Encoding are:

  • ascii
  • bigendianunicode
  • bigendianutf32
  • oem
  • unicode
  • utf7
  • utf8
  • utf8BOM
  • utf8NoBOM
  • utf32
codeulike
  • 2
    Did you... try... any of them? Lower half of character mappings in `ANSI` is the same as `ASCII`, give that a try :) – Mathias R. Jessen Oct 18 '22 at 12:05
  • If I use ASCII it mangles all the special characters/diacritics in the data. Ascii and Ansi are not the same encoding – codeulike Oct 18 '22 at 12:13
  • *Did you... try... any of them?* yes I did, no need to be snarky – codeulike Oct 18 '22 at 12:15
  • I didn't mean to be, I was genuinely curious - your original post contains no mention of diacritics or special characters, and no mention of what you've tried :) – Mathias R. Jessen Oct 18 '22 at 12:17
  • 1
    That's why encoding is complicated, people who live in ASCII-alphabet countries don't have to think about it until they hit a CSV file full of international names and addresses. Turns out PowerShell is missing an Ansi option. – codeulike Oct 18 '22 at 12:21
  • 1
    *no mention of what you've tried* If I'd written "I tried all the other encodings that aren't ansi and none of them were able to read an ansi file" I would have just looked even dumber – codeulike Oct 18 '22 at 12:44
  • It might be utf8 if it's utf8nobom – js2010 Oct 18 '22 at 14:13

1 Answer

To use ANSI encoding, i.e. the specific code page implied by the active legacy system locale (language for non-Unicode programs), such as Windows-1252:

  • in Windows PowerShell:

    -Encoding Default
    
  • in PowerShell (Core) 7+, which you're using, Default now refers to UTF-8, so more work is needed:

    • In PowerShell v7.3 and earlier:

      -Encoding ([cultureinfo]::CurrentCulture.TextInfo.ANSICodePage)
      
    • In PowerShell v7.4+:[1]

      -Encoding Ansi
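
Putting the options above together, here is a minimal sketch of a full Import-Csv call that picks the right -Encoding argument for whichever edition and version is running. The helper name Get-AnsiEncodingArgument and the sample file are hypothetical and exist only for illustration; the sample data also assumes that the active ANSI code page can represent Western European accented letters (as Windows-1252 can).

```powershell
# Hypothetical helper: returns an -Encoding argument that selects the active
# ANSI code page in whichever PowerShell edition/version is running.
function Get-AnsiEncodingArgument {
  $v = $PSVersionTable.PSVersion
  if ($v.Major -gt 7 -or ($v.Major -eq 7 -and $v.Minor -ge 4)) {
    'Ansi'                                                # PowerShell 7.4+
  }
  elseif ($v.Major -ge 6) {
    [cultureinfo]::CurrentCulture.TextInfo.ANSICodePage   # PowerShell (Core) up to 7.3
  }
  else {
    'Default'                                             # Windows PowerShell
  }
}

# Build a throwaway sample file encoded with the active ANSI code page.
# [char] codes are used so that this script's own encoding does not matter.
$csvPath = Join-Path ([IO.Path]::GetTempPath()) 'ansi-sample.csv'
$ansi = [Text.Encoding]::GetEncoding([cultureinfo]::CurrentCulture.TextInfo.ANSICodePage)
$text = "Name,City`r`nJos$([char]0xE9),M$([char]0xE1)laga`r`nZo$([char]0xEB),Z$([char]0xFC)rich`r`n"
[IO.File]::WriteAllBytes($csvPath, $ansi.GetBytes($text))

# Read the file back, decoding it as ANSI.
$rows = Import-Csv -LiteralPath $csvPath -Encoding (Get-AnsiEncodingArgument)
$rows
```

For a real file, skip the sample-file step and point $csvPath at the CSV that Notepad++ reports as ANSI.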
      

Default character encodings in the two PowerShell editions:

  • Windows PowerShell, the legacy, Windows-only, ships-with-Windows edition (whose latest and last version is v5.1.x), defaults to the active ANSI code page in key areas - notably Get-Content / Set-Content and when the PowerShell engine reads source code - but the defaults vary widely across the built-in cmdlets; case in point: Import-Csv defaults to UTF-8; see the bottom section of this answer for an overview.

  • PowerShell (Core), the modern, install-on-demand, cross-platform edition (which started with v6 and was at v7.2.x as of this writing), now fortunately consistently defaults to (BOM-less) UTF-8.
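
To check which edition a session is running and which code pages the current culture implies, something like the following sketch may help. The variable names are illustrative only.

```powershell
# 'Desktop' means Windows PowerShell; 'Core' means PowerShell (Core) 6+.
$edition = $PSVersionTable.PSEdition

# Code pages associated with the current culture.
$textInfo     = [cultureinfo]::CurrentCulture.TextInfo
$ansiCodePage = $textInfo.ANSICodePage   # e.g. 1252 for Windows-1252
$oemCodePage  = $textInfo.OEMCodePage

"Edition: $edition; ANSI code page: $ansiCodePage; OEM code page: $oemCodePage"
```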


[1] The absence of an Ansi -Encoding value in earlier PowerShell (Core) versions was a curious omission, given that an Oem value (for the active OEM code page) has always existed - see GitHub issue #6562 for the backstory.

mklement0
  • So the default used to be Ansi, they (sensibly) changed it to UTF-8 but then forgot they needed an Ansi option. – codeulike Oct 18 '22 at 12:27
  • Unfortunately yes, @codeulike. Please see my update re default encodings in the two PowerShell editions. – mklement0 Oct 18 '22 at 13:03
  • 1
    Good news, @codeulike: `-Encoding Ansi` will work in the upcoming 7.4 version, which is already available in a preview version. – mklement0 Apr 06 '23 at 08:38