3

I am trying to put together a script that will convert several Excel files into PDF files. I found a link to one online that works.

$path = Read-Host -Prompt 'Input Directory Path and Press Enter'
$xlFixedFormat = “Microsoft.Office.Interop.Excel.xlFixedFormatType” -as [type]
$excelFiles = Get-ChildItem -Path $path -include *.xls, *.xlsx -recurse
$objExcel = New-Object -ComObject excel.application
$objExcel.visible = $false
foreach($wb in $excelFiles)
{
    $filepath = Join-Path -Path $path -ChildPath ($wb.BaseName + “.pdf”)
    $workbook = $objExcel.workbooks.open($wb.fullname, 3)
    $workbook.Saved = $true
    “saving $filepath”
    $workbook.ExportAsFixedFormat($xlFixedFormat::xlTypePDF, $filepath)
    $objExcel.Workbooks.close()
}
$objExcel.Quit()

If I copy and paste this into a PowerShell window, the program runs as intended. However, when I attempted to make a shortcut to run the program, I get several errors (the file is saved as a .ps1).

This is the path and arguments I made when setting up the shortcut:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -noexit -ExecutionPolicy Bypass -File C:\[File Path]

This is the error message I get:

At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:8 char:62
+  $filepath = Join-Path -Path $path -ChildPath ($wb.BaseName + “.pdf ...
+                                                              ~
You must provide a value expression following the '+' operator.
At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:8 char:63
+ ... lepath = Join-Path -Path $path -ChildPath ($wb.BaseName + “.pdfâ€)
+                                                               ~~~~~~~~~~
Unexpected token '“.pdfâ€' in expression or statement.
At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:8 char:62
+  $filepath = Join-Path -Path $path -ChildPath ($wb.BaseName + “.pdf ...
+                                                              ~
Missing closing ')' in expression.
At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:7 char:1
+ {
+ ~
Missing closing '}' in statement block or type definition.
At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:8 char:73
+ ... lepath = Join-Path -Path $path -ChildPath ($wb.BaseName + “.pdfâ€)
+                                                                         ~
Unexpected token ')' in expression or statement.
At C:\Users\cbeals.ENVIROTECH\Documents\Test\ConvertExcelToPDF.ps1:14 char:1
+ }
+ ~
Unexpected token '}' in expression or statement.
    + CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
    + FullyQualifiedErrorId : ExpectedValueExpression

Why would this be failing?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
CBeals
  • 31
  • 1
  • 3
  • 1
    What program did you use to save it as a .ps1 file? I copied it from your answer, pasted it into Notepad and saved it and it ran just fine. The error message indicates it's not interpreting the quotes around ".pdf" correctly, which seems like it would be a file encoding issue of some sort. Try copying from your answer here, opening notepad, pasting it and saving it again. – kpogue Mar 07 '19 at 17:41
  • 3
    You are using typographic quotes above, change to normal single `'` or double `"` quotes. –  Mar 07 '19 at 17:44
  • 1
    @kpogue, this fixed it. I had initally used Notepad++ to save it. As you suggested, I saved it as a .ps1 using standard notepad and it worked. Thank you. – CBeals Mar 07 '19 at 17:58

1 Answers1

6

To clarify:

  • It is perfectly fine to use Unicode (non-ASCII-range) quotation marks such as in PowerShell - see the bottom section.

  • However, in order to use such characters in script files, these files must use a Unicode character encoding such as UTF-8 or UTF-16LE ("Unicode").

Your problem was that your script file was saved as UTF-8 without a BOM, which causes Windows PowerShell (but not PowerShell (Core) 7+) to misinterpret it, because it defaults to "ANSI" encoding, i.e., the single-byte legacy encoding associated with the legacy system locale a.k.a language for non-Unicode programs (e.g., Windows-1252 in the US and Western Europe), which PowerShell calls Default.

While replacing the Unicode quotation marks with their ASCII counterparts solves the immediate problem, any other non-ASCII-range characters in the script would continue to be misinterpreted.

  • The proper solution is to re-save the file as UTF-8 with BOM.

  • It is a good habit to form to routinely save all PowerShell scripts (source code) as UTF-8 with BOM, because that ensures that they're interpreted the same, irrespective of any given computer's system locale (culture), and irrespective of what PowerShell edition you use.

    • See this related answer, which shows how to configure Visual Studio Code to create new PowerShell files as UTF-8 with BOM.

To demonstrate the specific problem:

  • , the LEFT DOUBLE QUOTATION MARK (U+201C) Unicode character, is encoded as three bytes in UTF-8 format: 0xE2 0x80 0x9C.

  • You can verify this via the output from '“' | Format-Hex -Encoding Utf8 (only the byte sequence matters here; the printed characters. on the right are not representative in this case).

  • When Windows PowerShell reads this sequence as "ANSI"-encoded, it considers each byte a character in its own right, which is why you saw three characters for the single in your output, namely “.

  • You can verify this with [Text.Encoding]::Default.GetString([byte[]] (0xE2, 0x80, 0x9C)) (from PowerShell Core, use [Text.Encoding]::GetEncoding([cultureinfo]::CurrentCulture.TextInfo.ANSICodePage).GetString([byte[]] (0xE2, 0x80, 0x9C))).


Interchangeable use of ASCII-range and Unicode quotation marks, dashes, and whitespace in PowerShell:

In a properly encoded input file, PowerShell allows interchangeable use of the following quotation and punctuation characters; e.g., "hi", ”hi” and even "hi„ are equivalent.

Note:

  • Important: The above describes interchangeable syntactic use of these characters; if you use such characters in identifiers (which you shouldn't) or in strings[1], they are not treated the same.

  • The above was in part gleaned from the source code on GitHub (Functions Is... in file CharTraits.cs, also see above for SpecialChars definition).


[1] There are limited exceptions: given that PowerShell's -eq operator compares string using the invariant culture rather than performing ordinal comparison, space-character variations may be treated the same in string comparisons, depending on the host platform; e.g., "foo bar" -eq "foo`u{a0}bar" yields $true on macOS and Linux (but not Windows!), because the regular ASCII-range space is considered equal to the no-break space (U+00A0) there.

mklement0
  • 382,024
  • 64
  • 607
  • 775