I am using the following code to remove lines from file1.txt that are in file2.txt.

powershell -Command "$(Get-Content file1.txt) | Where-Object {$_ -notIn $(Get-Content file2.txt)}"

But I'm getting an error about -notIn saying that a value expression is expected, even though file2.txt exists and is not empty.

What is causing the error, and how can I fix it?

2 Answers

To complement LotPings' helpful answer:

  • For execution speed, do not execute Get-Content file2.txt in every loop iteration - cache its result beforehand.

  • For memory efficiency, do not collect all lines of file1.txt up front with $(...) before sending them through the pipeline.

Therefore (I'm omitting the powershell -Command "..." wrapper for brevity):

$f2 = Get-Content file2.txt; Get-Content file1.txt | Where-Object { $f2 -NotContains $_ }
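As a self-contained sketch of the command above, the following creates two throwaway files and filters one against the other. The temp directory, the file contents and the variable names other than $f2 are assumptions made up for illustration:

```powershell
# Hypothetical sample files in a scratch directory (illustration only).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
$file1 = Join-Path $dir 'file1.txt'
$file2 = Join-Path $dir 'file2.txt'
Set-Content -Path $file1 -Value 'alpha', 'beta', 'gamma', 'delta'
Set-Content -Path $file2 -Value 'beta', 'delta'

# Read file2.txt once, up front, instead of once per input line.
$f2 = Get-Content $file2

# Stream file1.txt line by line and keep only the lines not found in file2.txt.
$result = Get-Content $file1 | Where-Object { $f2 -NotContains $_ }
$result   # alpha, gamma

# Clean up the scratch directory.
Remove-Item -Recurse -Force $dir
```

Note that -NotContains, like most PowerShell comparison operators, compares strings case-insensitively by default.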

Which $ are necessary, and why?

(...), $(...) (the subexpression operator), and @(...) (the array subexpression operator) all collect the entire output of the enclosed command(s) as a whole, in an array ([System.Object[]]); if the output comprises only 1 item, (...) and $(...) return that item itself, whereas @(...) still returns a (single-element) array.
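A quick way to see the collecting behavior is to inspect the type of what each operator returns. This is only a sketch; the commands used (1..3, Write-Output 42) are arbitrary stand-ins, and the variables exist only to hold the results. Note that with a single output item, @(...) behaves differently from the other two:

```powershell
# Multiple output items: all three operators collect them in an [object[]] array.
$multiParen = (1..3 | ForEach-Object { $_ }).GetType().Name     # Object[]
$multiSub   = $(1..3 | ForEach-Object { $_ }).GetType().Name    # Object[]
$multiArr   = @(1..3 | ForEach-Object { $_ }).GetType().Name    # Object[]

# A single output item: (...) and $(...) return the item itself ...
$oneParen = (Write-Output 42).GetType().Name                    # Int32
$oneSub   = $(Write-Output 42).GetType().Name                   # Int32
# ... whereas @(...) still wraps it in an array.
$oneArr   = @(Write-Output 42).GetType().Name                   # Object[]

$multiParen, $multiSub, $multiArr, $oneParen, $oneSub, $oneArr
```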

(...), which can only enclose a single command or expression, is needed:

  • to clarify precedence in an expression,
  • to use a command (cmdlet / function / external utility call) as part of an expression,
  • In a pipeline, you can use it to force collecting the enclosed command's entire output beforehand, but doing so negates the memory-throttling benefit of a pipeline.
    • That said, using something like (Get-Content file) | ... | Set-Content file enables updating a file named file in place, but note that this requires the entire contents of that file to fit into memory.
  • Unless you have additional requirements as stated below, prefer (...) to $(...) and @(...).
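The uses of (...) listed above can be sketched as follows. The scratch file and its contents are assumptions made up for illustration:

```powershell
# 1. Clarify precedence in an expression.
$product = (1 + 2) * 3          # 9, whereas 1 + 2 * 3 yields 7

# 2. Use a command's output as part of an expression.
$year = (Get-Date).Year

# 3. Collect a file's lines up front so that the same file can be
#    rewritten at the end of the same pipeline.
$file = [System.IO.Path]::GetTempFileName()   # hypothetical scratch file
Set-Content -Path $file -Value 'one', 'two', 'three'
(Get-Content $file) | Where-Object { $_ -ne 'two' } | Set-Content $file
$updated = Get-Content $file
$updated                        # one, three

Remove-Item $file
```

In step 3, the parentheses make Get-Content finish reading before Set-Content opens the same file for writing; the price is that all lines are held in memory at once.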

$(...) is only needed (tip of the hat to PetSerAl for his help):

  • to enclose statements such as if and foreach

  • to enclose multiple commands/expressions/statements

  • to embed the above inside "..." (a double-quoted string, whose contents are subject to interpolation (expansion)).
    Among the operators listed, $(...) is the only one that can (directly) be embedded inside a double-quoted string.

  • $(...) - unlike (...) - doesn't abort the entire command if a potentially terminating error occurs; e.g.:

    • 'hi' + $(1/0) and "hi$(1/0)" report the division-by-zero error, yet still print hi; by contrast, 'hi' + (1/0) does not - the division-by-zero error aborts the entire expression.
    • Note that an unconditionally terminating error even aborts any $(...)-containing expression; e.g., 'hi' + $(Throw 'err') and "hi$(Throw 'err')" both only print the error message, not also hi.
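A minimal sketch of the first three uses of $(...) listed above; the variable names and values are arbitrary and only serve as illustration:

```powershell
# Enclose a statement (if, foreach, ...) so that its output can be used in an expression.
$n = 5
$parity = 'n is ' + $(if ($n % 2) { 'odd' } else { 'even' })
$parity                                   # n is odd

# Enclose multiple commands; their combined output is collected.
$all = $(Write-Output 'a'; Write-Output 'b')
$all.Count                                # 2

# Embed a command or expression inside a double-quoted string.
$text = "Sum: $(1 + 2); joined: $($all -join '+')"
$text                                     # Sum: 3; joined: a+b
```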

@(...) is only needed:

  • to ensure that a command's output is an array, even if only 1 item is returned; in other respects, it acts like $(), except that you cannot use it inside "..."; for a detailed discussion of @(), see this answer of mine.

  • In PSv3+, the unified handling of scalars and arrays typically makes @(...) unnecessary, however - see this answer of mine.
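The effect of @(...) can be sketched like this; Write-Output and the range 1..3 merely stand in for any command that may return one item or none:

```powershell
# A command that happens to return a single item yields that item itself ...
$one = Write-Output 'only'
$oneType = $one.GetType().Name            # String

# ... unless @(...) forces an array, which makes .Count and indexing predictable.
$arr = @(Write-Output 'only')
$arrType = $arr.GetType().Name            # Object[]
$arrCount = $arr.Count                    # 1
$first = $arr[0]                          # only

# With no output at all, @(...) yields an empty array rather than $null.
$noneCount = @(1..3 | Where-Object { $_ -gt 5 }).Count   # 0

$oneType, $arrType, $arrCount, $first, $noneCount
```

In PSv3+, $one.Count also reports 1 for a scalar, which is why @(...) is often dispensable there. Indexing is a different matter for strings: $one[0] returns the first character of the string, not the string itself.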

mklement0
  • Wow! I'm carefully reading every word you wrote. Thank you! – RockPaperLz- Mask it or Casket Jun 18 '17 at 02:26
  • That's about a 400% speed increase! :) – RockPaperLz- Mask it or Casket Jun 18 '17 at 02:44
  • @PetSerAl: I get what you're saying, but then it's not a true _terminating_ error: try `'hi' + $(throw "me")`, for instance. Expressions such as `1/0` and .NET method calls such as `[int]::Parse('foo')` occupy this strange middle ground between terminating and non-terminating errors. My personal preference: DO make them truly terminating errors. In the meantime: what do we call these terminating-under-some-circumstances errors? – mklement0 Jun 18 '17 at 22:58
  • I am not sure why you say `1/0` is not a true terminating error, because it behaves exactly like a cmdlet's call to the `ThrowTerminatingError` method: `Add-Type 'using System.Management.Automation; [Cmdlet("Throw", "TerminatingError")] public class ThrowTerminatingErrorCmdlet : Cmdlet { protected override void EndProcessing() { ThrowTerminatingError(new ErrorRecord(new System.Exception(), "Error", ErrorCategory.NotSpecified, null)); } }' -PassThru | Select-Object -ExpandProperty Assembly | Import-Module`. – user4003407 Jun 18 '17 at 23:31
  • And `(Write-Host Before)+(Throw-TerminatingError)+(Write-Host After); 'Next statement'` vs `(Write-Host Before)+(1/0)+(Write-Host After); 'Next statement'`, or `try { Write-Host Before; Throw-TerminatingError; Write-Host After } catch [int] { 'Error not handled so this is not printed' }; 'Next statement'` vs `try { Write-Host Before; 1/0; Write-Host After } catch [int] { 'Error not handled so this is not printed' }; 'Next statement'`. IMHO, `throw` is just a special case which is more terminating than standard terminating errors. – user4003407 Jun 18 '17 at 23:36
  • @PetSerAl: Your examples are very helpful as usual, but there must still be something I'm not getting about the distinction between the error generated by `1/0`, for instance, vs. `throw`. The documentation doesn't help. I'll open a GitHub issue. – mklement0 Jun 19 '17 at 14:41
The -notIn operator requires PowerShell v3 or later. For PowerShell v2, reverse the operands and use -NotContains:

powershell -Command "$(Get-Content file1.txt) | Where-Object {$(Get-Content file2.txt) -NotContains $_ }"
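To illustrate how the two operators relate, here is a small sketch that uses in-memory arrays instead of files (the sample values are made up). -NotContains takes the collection on the left, whereas -notin takes it on the right:

```powershell
$exclude = 'beta', 'delta'
$lines   = 'alpha', 'beta', 'gamma'

# Works in PowerShell v2 and later: collection on the left-hand side.
$viaNotContains = $lines | Where-Object { $exclude -NotContains $_ }

# Requires PowerShell v3 or later: collection on the right-hand side.
$viaNotIn = $lines | Where-Object { $_ -notin $exclude }

$viaNotContains   # alpha, gamma
$viaNotIn         # alpha, gamma
```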
