Unquoted tokens in argument mode involving variable references and subexpressions: why are they sometimes split into multiple arguments?

Question

^{Note: A summary of this question was posted at the PowerShell GitHub repository and has since been superseded by this more comprehensive issue.}

Arguments passed to a command in PowerShell are parsed in argument mode (as opposed to expression mode - see Get-Help about_Parsing).

Conveniently, (double-)quoting arguments that do not contain whitespace or metacharacters is usually optional, even when these arguments involve variable references (e.g., `$HOME\sub`) or subexpressions (e.g., `version=$($PsVersionTable.PsVersion)`).

For the most part, such unquoted arguments are treated as if they were double-quoted strings, and the usual string-interpolation rules apply (except that metacharacters such as `,` need escaping).

I've tried to summarize the parsing rules for unquoted tokens in argument mode in this answer, but there are curious edge cases:

Specifically (as of Windows PowerShell v5.1), why is the unquoted argument token in each of the following commands NOT recognized as a single, expandable string, resulting instead in 2 arguments being passed (with the variable reference / subexpression retaining its type)?

$(...) at the start of a token:
```
Write-Output $(Get-Date)/today # -> 2 arguments: [datetime] obj. and string '/today'
```
- Note that the following work as expected:
  - Write-Output $HOME/sub - simple var. reference at the start
  - Write-Output today/$(Get-Date) - subexpression not at the start

.$ at the start of a token:
```
Write-Output .$HOME  # -> 2 arguments: string '.' and value of $HOME
```
- Note that the following work as expected:
  - Write-Output /$HOME - different initial char. preceding $
  - Write-Output .-$HOME - initial . not directly followed by $
  - Write-Output a.$HOME - . is not the initial char.
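
To check how many arguments a given token actually produces, a small helper along the following lines may be useful. This is only a sketch: `Show-Args` is a hypothetical name rather than a built-in command, and it relies on the automatic `$args` variable of simple functions. The counts in the comments are the ones described above for Windows PowerShell v5.1; other versions may behave differently.

```powershell
# Hypothetical helper (not a built-in): reports each argument as received.
function Show-Args {
    # $args holds the unbound positional arguments of a simple function.
    '{0} argument(s)' -f $args.Count
    foreach ($arg in $args) {
        # Report the type name and the string form of each argument.
        '  [{0}] {1}' -f $arg.GetType().Name, $arg
    }
}

Show-Args $HOME/sub            # 1 argument  (string)
Show-Args today/$(Get-Date)    # 1 argument  (string)
Show-Args $(Get-Date)/today    # 2 arguments ([datetime] object and '/today')
Show-Args .$HOME               # 2 arguments ('.' and the value of $HOME)
```

Because the helper prints the type of each argument, it also shows that the subexpression's value keeps its original type when the token is split.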

As an aside: As of PowerShell Core v6.0.0-alpha.15, a = following a simple var. reference at the start of a token also seems to break the token into 2 arguments, which does not happen in Windows PowerShell v5.1; e.g., Write-Output $HOME=dir.

Note:

- I'm primarily looking for a design rationale for the described behavior, or, as the case may be, confirmation that it is a bug. If it's not a bug, I want something to help me conceptualize the behavior, so I can remember it and avoid its pitfalls.
- All these edge cases can be avoided with explicit double-quoting, which, given the non-obvious behavior above, may be the safest choice to use routinely.
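
A minimal sketch of the double-quoting workaround, applied to the tokens discussed above. Wrapping the call in `@(...)` and reading `.Count` is just one assumed way of confirming that a single string comes back; the variable name `$quotedCount` is illustrative.

```powershell
# Double-quoting turns each token into a single expandable string.
Write-Output "$(Get-Date)/today"   # one string: the date followed by '/today'
Write-Output ".$HOME"              # one string: '.' followed by the value of $HOME
Write-Output "$HOME=dir"           # one string: the value of $HOME followed by '=dir'

# Sanity check: @(...) collects the output objects so they can be counted.
$quotedCount = @(Write-Output "$(Get-Date)/today").Count
$quotedCount                       # 1
```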

Optional reading: The state of the documentation and design musings

As of this writing, the v5.1 Get-Help about_Parsing page:

- incompletely describes the rules
- uses terms that are neither defined in the topic nor generally in common use in the world of PowerShell ("expandable string", "value expression" - though one can guess their meaning)

From the linked page (emphasis added):

In argument mode, each value is treated as an expandable string unless it begins with one of the following special characters: dollar sign ($), at sign (@), single quotation mark ('), double quotation mark ("), or an opening parenthesis (().

If preceded by one of these characters, the value is treated as a value expression.

^{As an aside: A token that starts with " is, of course, by definition, also an expandable string (interpolating string).

Curiously, the conceptual help topic about quoting, Get-Help about_Quoting_Rules, manages to avoid both the terms "expand" and "interpolate".}

Note how the passage does not state what happens when (non-meta)characters directly follow a token that starts with these special characters, notably $.

However, the page contains an example that shows that a token that starts with a variable reference is interpreted as an expandable string too:

With $a containing 4, Write-Output $a/H evaluates to (single string argument) 4/H.

Note that the passage does imply that variable references / subexpressions in the interior of an unquoted token (that doesn't start with a special char.) are expanded as if inside a double-quoted string ("treated as an expandable string").

If these work:

$a = 4
Write-Output $a/H         # -> '4/H'
Write-Output H/$a         # -> 'H/4'
Write-Output H/$(2 + 2)   # -> 'H/4'

why shouldn't Write-Output $(2 + 2)/H expand to '4/H' too (instead of being treated as 2 arguments)?
Why is a subexpression at the start treated differently than a variable reference?

Such subtle distinctions are hard to remember, especially in the absence of a justification.

A rule that would make more sense to me is to unconditionally treat a token that starts with $ and has additional characters following the variable reference / subexpression as an expandable string as well.
(By contrast, it makes sense for a standalone variable reference / subexpression to retain its type, as it does now.)

Note that the case of a token that starts with .$ getting split into 2 arguments is not covered in the help topic at all.

Even more optional reading: following a token that starts with one of the other special characters with additional characters.

Among the other special token-starting characters, the following unconditionally treat any characters that follow the end of the construct as a separate argument (which makes sense):
( ' "

Write-Output (2 + 2)/H   # -> 2 arguments: 4 and '/H'
Write-Output "2 + $a"/H  # -> 2 arguments: '2 + 4' and '/H', assuming $a equals 4
Write-Output '2 + 2'/H   # -> 2 arguments: '2 + 2' and '/H'

^{As an aside: This shows that bash-style string concatenation - placing any mix of quoted and unquoted tokens right next to each other - is not generally supported in PowerShell; it only works if the 1st substring / variable reference happens to be unquoted. E.g., Write-Output H/'2 + 2', unlike the substrings-reversed example above, produces only a single argument.}
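
The asymmetry described in this aside can be checked with a one-line counting function. This is a sketch only: `Get-ArgCount` is a hypothetical helper, and the counts in the first two comments are the ones reported in this question rather than documented guarantees.

```powershell
# Hypothetical helper (not a built-in): returns the number of arguments received.
function Get-ArgCount { $args.Count }

Get-ArgCount H/'2 + 2'    # first part unquoted: reported as 1 argument
Get-ArgCount '2 + 2'/H    # first part quoted:   reported as 2 arguments
Get-ArgCount "H/2 + 2"    # whole token quoted:  1 argument
```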

The exception is @: while @ does have special meaning (see Get-Help about_Splatting) when followed by just a syntactically valid variable name (e.g., @parms), anything else causes the token to be treated as an expandable string again:

Write-Output @parms    # splatting (results in no arguments if $parms is undefined)

Write-Output @parms$a  # *expandable string*: '@parms4', if $a equals 4
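
For comparison, here is a sketch of what `@parms` does once `$parms` is actually defined. The hashtable contents are an assumption made purely for illustration; `-InputObject` is the `Write-Output` parameter that receives the objects to output.

```powershell
# Assumed example values, for illustration only.
$parms = @{ InputObject = 'hello' }   # keys are parameter names
$a = 4

# Splatting: equivalent to Write-Output -InputObject 'hello'
$splatted = Write-Output @parms
$splatted                  # hello

# Extra characters after the variable name: no splatting takes place; as
# described above, the token is treated as an expandable string ('@parms4').
Write-Output @parms$a
```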

isn't powershell open source, technically you can look at the source and determine the root cause? :) — 4c74356b41, Feb 07 '17 at 21:27
I've always found the actual parsing of `$var/string` as one expandable string to be contrary to what the documentation says, since the expression starts with `$`. `$(date)/sub` as two distinct arguments (a value expression and a bareword string) makes perfect sense — Mathias R. Jessen, Feb 07 '17 at 21:48
@4c74356b41: Fair point, please tell us what you find :) But seriously: I'm looking for a _design rationale_, which the source code typically won't tell you. — mklement0, Feb 08 '17 at 13:14
@MathiasR.Jessen: Please see my update, which makes the point that `$var/string` being treated as an expandable string as well does not inherently contradict the - inadequate - documentation, and that there's no good reason to treat something like `$(Get-Date)/sub` (a subexpression rather than a variable reference) differently. — mklement0, Feb 12 '17 at 23:15
`As an aside: This shows that bash-style string concatenation - placing any mix of quoted and unquoted tokens right next to each other - is not supported in PowerShell.` Unfortunately, this is supported. I see people use it sometimes and it always throws me off. `Trace-Command -Name ParameterBinding -PSHost -Expression { Write-Host -Object Hello'Hi'sup"whatup" }` — briantist, May 24 '17 at 02:56
@mklement0 a design rationale may very well become crystal clear after viewing source code. Or perhaps it will be completely opaque. There is no way to know until you look! — briantist, May 24 '17 at 02:59
@briantist: Thanks for the `Hello'Hi'sup"whatup"` example - didn't know that worked, but, as it turns out, it only works if the 1st substring is _unquoted_ (try `Write-Host -Object 'Hello'sup"whatup"`), which is exactly the kind of obscure behavior that prompted this question. — mklement0, May 24 '17 at 12:23
@briantist: The fact that digging into the source code may turn out to be an exercise in futility is what kept me from attempting it (that and the fact that I'm sure it's a complex piece of code). A member of the PowerShell team has since provided an answer [here](https://github.com/PowerShell/PowerShell/issues/3217#issuecomment-303579931), which I'm still digesting. — mklement0, May 24 '17 at 12:46
@mklement0 I feel you on that. Following the code on github without stepping through a debugger can be maddening. Glad to see there's someone on the dev team engaged in this question! — briantist, May 24 '17 at 13:53
I have noticed the same and gotten into the habit that if I want to ensure all my output is a solid string I encapsulate all in double-quotes. Write-Output "$(Get-Date)/today" — Parrish, Nov 09 '17 at 18:06
what I'm getting from these examples working as expected: `Write-Output $HOME/sub` `Write-Output today/$(Get-Date)` and this not working as expected: `Write-Output $(Get-Date)/today` is that powershell tries to convert the second part to the same type as the first part: `$HOME` is already a string so there is no problem to concatenate it with `/sub`. `today/` is being treated as string therefore `$(Get-Date)` gets converted .tostring() however in the last example powershell is unable to convert `/today` to a timedate type. — Aurimas Stands with Ukraine, Apr 05 '18 at 21:09
@AurimasN: The breaking in two is unrelated to whether the initial `$(...)` expression is a string or not, as the following example demonstrates: `Write-Output $('home')/sub` - even though `$('home')` is of type string, `/sub` is passed as a separate argument. — mklement0, Apr 05 '18 at 22:09
Note the questionable difference: `$x=Get-Date; Write-Output $x/today` versus similar `Write-Output $(Get-Date)/today` — JosefZ, Jul 05 '18 at 18:20

score 1 · Answer 1 · answered Jul 25 '18 at 18:32

I think what you're sort of hitting here is more the type "hinting" than anything else.

You're using Write-Output, which specifies in its Synopsis that it

Sends the specified objects to the next command in the pipeline.

This command is designed to take in an array. When it hits the first item as a string like today/ it treats it like a string. When the first item ends up being the result of a function call, that may or may not be a string, so it starts up an array.

It's telling that if you run the same command to Write-Host (which is designed to take in a string to output) it works as you'd expect it to:

 Write-Host $(Get-Date)/today

Outputs

7/25/2018 1:30:43 PM /today

So I think the edge cases you're running up against are less about the parsing and more about the typing that PowerShell uses (and tries to hide).

answered Jul 25 '18 at 18:32

SamuelWarren


Thanks, but `Write-Host` too parses the `$(Get-Date)/today` as 2 distinct arguments - it's just less obvious, because it uses a _space_ to separate the elements in the output, whereas `Write-Output` outputs each element on its own line. In other words: the parsing is consistent, and in neither case would I expect the token to be treated as _2_ arguments, given that simply _swapping_ the tokens involved is treated as _1_ argument: `Write-Output today/$(Get-Date)` – mklement0 Jul 25 '18 at 18:40
As an aside: the 2 arguments that result from `$(Get-Date)/today` are not _passed_ as an array; they are passed as individual positional arguments that only end up in an array because `Write-Output` and `Write-Host` declare their `-InputObject` / `-Object` parameter with the special `ValueFromRemainingArguments` parameter-attribute flag, which causes the individually passed arguments to be _implicitly collected_ in an array. Due to this flag, in effect `Write-Output 1, 2` - passing an _array_ - is then the same as `Write-Output 1 2` - passing individual arguments. – mklement0 Jul 25 '18 at 18:52
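
The two comments above can be illustrated with a short sketch. It assumes only the standard `-Separator` parameter of `Write-Host`; the variable names are illustrative.

```powershell
# -Separator replaces the default space that Write-Host prints between objects,
# so the separator becomes visible only if more than one object was received.
Write-Host $(Get-Date)/today -Separator ' | '     # two arguments: separator is shown
Write-Host "$(Get-Date)/today" -Separator ' | '   # one string: no separator is shown

# Individually passed arguments and an explicit array produce the same output.
$fromArgs  = @(Write-Output 1 2)     # two positional arguments
$fromArray = @(Write-Output 1, 2)    # one array argument
$fromArgs.Count                      # 2
$fromArray.Count                     # 2
```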
