2

I'm relatively new to using Invoke-RestMethod /Invoke-WebRequest in general- I think that's important context for the below question.

I'm making a call like the below:

$Headers = @{
    "Authorization" = "Bearer {0}" -f ($token)
}

$APIUri = "https://myendpoint/information/blah"

$parameters = @{
    '$expand' = 'someparams'
    '$filter' = "moreparams"
}

$result = Invoke-RestMethod -Headers $headers -Uri $APIUri  -Method Get -Body $parameters

But when checking type of the returned object $result.GetType():

IsPublic IsSerial Name                                     BaseType                                                                                                                                                                                              
-------- -------- ----                                     --------                                                                                                                                                                                              
True     True     String                                   System.Object

I end up with a string, rather than a PSObject or JSON. This means I can't leverage this data without doing some crazy string manipulation to get it into a format I can use.

Now, if I simply output the results to a CSV file like so, the data parses correctly:

Invoke-RestMethod -Headers $headers -Uri $APIUri  -Method Get -Body $parameters -OutFile "mylocation.csv" 

I could push to a csv, then import it back via Import-CSV. But I'd rather avoid having to take that extra step since the data should be available already, and I have to make a lot of calls, which would return many files.

Any idea what might be going on here? I found a Reddit post that mentioned if the conversion from JSON or XML fails, it silently returns a string object. But not sure why that would be happening in this case.

EDIT: Almost immediately after posting this I'm realizing that the returned value has unexpected (crazy) characters prepended:

Column1,Column2,Column3
Data1,Data2,Data3
Data1,Data2,Data3

So I'm guessing that those characters are somehow killing the conversion (would also make sense while my other efforts to correct using ConvertTo-xxx and ConvertFrom-xxx were getting killed as well). But somehow using the -Outfile switch is overcoming those character results.

But I'm curious why those chars wouldn't also throw an issue for the Outfile switch?

I can remove those few characters via .Substring(), is there any easy way to convert this to an object at this point after having removed the characters?

I've built around this scenario using something like the below since there are at least newLines built in to the result:

$properResult = $result.RawContent.Split([Environment]::NewLine, [System.StringSplitOptions]::RemoveEmptyEntries)

foreach ($line in $properResult)
$lineObject = [PSCustomObject]@{
            Column1 = $line[0]
            Column2 = $line[1]
            Column3 = $line[2]
        }

But it's kind of ugly. Any help would be appreciated.

Johnny Welker
  • 104
  • 1
  • 7

2 Answers2

3

[1] is the result of a mis-decoded UTF-8 BOM (Unicode signature):

  • Your web service is sending UTF-8 data and precedes it with a BOM (which is not typical; BOMs are more typically used in files), instead of indicating the character encoding via a charset attribute in the Content-Type response header field.

  • In Windows PowerShell and up to PowerShell (Core) 7.3.x:

    • PowerShell does not honor a BOM in response data:

    • In the absence of a charset attribute in the Content-Type response header field it defaults to ISO-8859-1, a near-complete subset of the Windows-1252 encoding, which serves as the ANSI code page in English and Western European locales (cultures).

      • This will change to UTF-8 in 7.4+ (and is already implemented in its preview versions) - see GitHub PR #18219.
  • The automatic parsing into objects that Invoke-RestMethod performs is limited to:

    • XML media types:

      • application/xml is parsed into a single [xml] instance.
      • application/rss+xml and application/atom+xml are parsed into a collection of XmlElement instances
    • JSON (application/json):

    • Any other media types, including CSV data (text/csv) are simply returned as string data ([string]) as of PowerShell 7.3.3, and I'm not aware of plans to add automatic parsing support for additional media types.


The upshot is:

  • Your problem will likely go away in PowerShell v7.4+

  • If you have control over the web service, you can fix the problem in earlier versions too, by making the responses include a charset=utf-8 attribute in the Content-Type header field; e.g., text/plain; charset=utf-8, and sending the data without a BOM.

  • Otherwise, you need workarounds:

    • Using -OutFile to write the data to a (temporary) file is a viable workaround:

      • PowerShell writes the raw bytes to a file, and the BOM is honored by PowerShell when it reads from a file.
    • An in-memory workaround requires more work: You must use Invoke-WebRequest instead of Invoke-RestMethod, which gives you access to the raw bytes of the response data, which you can then manually decode as UTF-8 and then parse with ConvertFrom-Csv:

# ...

# Note: Use Invoke-*WebRequest*
$result = Invoke-WebRequest -Headers $headers -Uri $APIUri  -Method Get -Body $parameters

# Decode the raw result data as UTF-8 and skip the BOM with .Substring(1)
# (A *decoded* BOM is a *single* Unicode character; if your response
#  truly has *two* BOMs, use .Substring(2)
$stringDecodedAsUtf8 =
  [Text.Encoding]::UTF8.GetString(
    $result.RawContentStream.ToArray()
  ).Substring(1)

# Now you can parse the correctly decoded string data as CSV.
$stringDecodedAsUtf8 | ConvertFrom-CsvData

[1] If this sequence really appears twice at the start of your data, even with correct interpretation as BOM-prefixed UTF-8 you'd end up with a - normally invisible - U+FEFF) character(ZERO WIDTH NO-BREAK SPACE) at the start of the decoded data.
In fact, a BOM byte sequence is the encoding of that character in the encoding being indicated. That is, while it is a character in its own right, it also serves as the BOM, but only at the very start of the data.
, specifically, is the result of the following mis-decoding of U+FEFF:
[Text.Encoding]::GetEncoding('iso-8859-1').GetString([Text.Encoding]::UTF8.GetBytes([char] 0xFEFF))

mklement0
  • 382,024
  • 64
  • 607
  • 775
  • Fantastic information from the change in PowerShell 7.4 to how to properly decode that raw data. It did in fact have two BOMs, odd. A question on one of your other points, when you say a temporary file (CSV), do you mean just writing and then deleting the file afterwards? That would work as well I suppose. – Johnny Welker Apr 10 '23 at 14:24
  • 1
    Glad to hear it helped, @JohnnyWelker. Re CSV file: yes, that's what I meant. However, if you have a double BOM, you'll still need to manually remove the extra `U+FEFF` you'll therefore get as the string's first character. – mklement0 Apr 10 '23 at 14:30
2

Those  characters represent UTF-8 with BOM data improperly encoded/decoded as UTF-8. You'll want to fix that issue rather than attempt to strip them out to avoid other encoding problems. If they don't appear when viewing the result outfile in notepad, then you can mess with the encoding to see what is required

If you cannot change how your server provides the data, you can probably use the -ContentType parameter of Invoke-RestMethod to specify how you decode it like so:

$UTF8BOM = Invoke-RestMethod -ContentType "text/csv; charset=windows-1252" @OtherParams
# or no BOM
$UTF8    = Invoke-RestMethod -ContentType "text/csv; charset=utf-8" @OtherParams

Using type text/csv might get powershell to convert your data to objects (I don't have one to test with), but I'm pretty sure it only does that for actual json/xml content. You should still be able to do something really simple like $csv = $result | ConvertFrom-Csv

Cpt.Whale
  • 4,784
  • 1
  • 10
  • 16
  • 1
    Good point about the BOM, but `-ContentType` does _not_ help, as it only applies to data sent _to_ the server. Also, as you suspected, `Invoke-RestMethod` unfortunately does _not_ automatically parse `text/csv` data (as of this writing, but I'm not aware of plans to change that) - only various XML media types and `application/json`; everything else falls back to returning a `[string]`. – mklement0 Apr 07 '23 at 19:58