I have the following unanswered questions, and I am looking for documentation that explains the PowerShell core requirement for LF vs CRLF line ending in Linux environments.

1- Can PWSH Core handle files with CRLF?

2- When I run a ps1 file with an LF line ending, can it call another ps1 file with mixed CRLF line endings?

3- Is the PWSH line ending requirement documented and consistent across all Linux distributions?

I am asking these questions because PowerShell was born in a Windows environment, and I expect it can somehow tolerate LF vs. CRLF discrepancies.

A link to online documentation would be a great help. I did search and surprisingly, I could not find any.

Allan Xu
  • https://www.oreilly.com/library/view/mastering-windows-powershell/9781787126305/54b34ba6-f5e8-4649-894f-c46f42121893.xhtml – Ben Voigt Jan 13 '21 at 19:11
  • Summary of above: Powershell itself can tolerate both line ending styles or any mix, but if you `chmod +x` the script, the shebang line which the kernel reads to determine which interpreter to spawn has to be plain LF. – Ben Voigt Jan 13 '21 at 19:12
  • @BenVoigt that would be great as an answer! – briantist Jan 13 '21 at 20:20
  • @briantist: But it isn't the official documentation that OP has asked for, nor am I an expert on the topic. – Ben Voigt Jan 13 '21 at 20:50
  • @BenVoigt that's perfectly ok! Other people can write answers too if they have more to share. The main point is that your answer is _useful_, which, I mean, I feel silly having just realized you have 237k+ rep so I'm basically preaching to the choir ;-) – briantist Jan 13 '21 at 21:02
  • to further state the point: 99% of my rep comes from PowerShell questions, whether or not that makes me an expert is debatable; I knew that either line endings worked on any platform; I didn't know the LF was needed for shebang, so TIL – briantist Jan 13 '21 at 21:03

3 Answers

Just preserving the content of the link mentioned in the comments (source):

Line endings

Windows editors, including ISE, tend to use a carriage return followed by linefeed (\r\n or `r`n) at the end of each line. Linux editors use linefeed only (\n or `n).

Line endings are less important if the only thing reading the file is PowerShell (on any platform). However, if a script is set to executable on Linux, a shebang must be included and the line-ending character used must be linefeed only.

For example, a script named test.ps1, created as follows, must use \n to end lines:

#!/usr/bin/env powershell 
Get-Process 

The first line is the shebang and lets Linux know which parser to use when executing the shell script.

Once created, chmod may be used to make the script executable outside of PowerShell:

chmod +x test.ps1

According to this, PowerShell can handle both - CRLF and LF - even in the same file. The only exception is the shebang: it has to end with LF.
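
If a script's first line may have picked up a CRLF ending (for example, after being saved in a Windows editor), one way to check and repair only that line from a POSIX shell is sketched below. The function names `first_line_has_cr` and `fix_shebang_line` are hypothetical helpers made up for this illustration, not part of PowerShell or any standard tool. The rest of the file is deliberately left untouched, since PowerShell accepts either ending there.

```shell
# Hypothetical helper: succeeds (exit status 0) if the first line of the
# file given as $1 ends with a carriage return, i.e. is CRLF-terminated.
first_line_has_cr() {
    cr=$(printf '\r')
    # Command substitution strips the trailing LF but keeps a trailing CR.
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        *"$cr") return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical helper: remove CR characters from the first line only,
# leaving every other line (and the file's permissions) as they were.
fix_shebang_line() {
    tmp=$(mktemp) || return 1
    {
        head -n 1 "$1" | tr -d '\r'
        tail -n +2 "$1"
    } > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

A possible use would be `first_line_has_cr test.ps1 && fix_shebang_line test.ps1` before running `chmod +x test.ps1`.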

Start-Automating
stackprotector

stackprotector's community wiki answer, inspired by a link provided by Ben Voigt, provides the gist of an answer to your questions.

Let me complement it by answering your questions one by one:

I am looking for documentation that explains the PowerShell core requirement for LF vs CRLF line ending in Linux environments.

As of this writing, there appears to be no such documentation:

  • The conceptual about_Special_Characters help topic discusses `r (CR) and `n (LF) separately, but not explicitly in terms of their potential function as newline characters / sequences (line breaks; the topic uses the term "new line" to refer to a LF character, specifically).

1- Can PWSH Core handle files with CRLF?

Yes - both PowerShell editions - the legacy, ships-with-Windows, Windows-only Windows PowerShell edition (whose latest and final version is v5.1), as well as the cross-platform, install-on-demand, PowerShell (Core) edition (v6+) - treat CRLF and LF newlines interchangeably - both with respect to reading source code and reading files[1] - even if the two newline formats are mixed in a single file.
While not documented as such, PowerShell has always worked this way, and, given its commitment to backward compatibility, this won't change (which in this case is definitely a blessing, given that PowerShell (Core) is now cross-platform and must be able to handle files with LF-only newlines on Unix-like platforms).

The following examples demonstrate this - they work the same on all supported platforms ("`n" creates a LF-only newline, "`r`n" a CRLF newline):

# Read a file with mixed newline formats.
PS> "one`ntwo`r`nthree" > temp.txt; (Get-Content temp.txt).Count; Remove-Item temp.txt
3

# Execute a script with mixed newline formats.
PS> "'one'`n'two'`r`n'three'" > temp.ps1; . ./temp.ps1; Remove-Item temp.ps1
one
two
three
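
To see which newline formats a given file actually contains before handing it to PowerShell, a small shell function along the following lines may help. This is only a sketch: `count_line_endings` is a hypothetical name, and it assumes an `awk` that understands `\r` in regular expressions (as POSIX awk implementations do).

```shell
# Hypothetical helper: report how many lines of the file given as $1
# end in CRLF and how many end in LF only (or have no newline at all,
# as can happen for the last line).
count_line_endings() {
    awk '
        /\r$/ { crlf++; next }   # line still carries a CR before the LF
              { lf++ }           # plain LF-terminated (or final) line
        END   { printf "crlf=%d lf=%d\n", crlf, lf }
    ' "$1"
}
```

For a file created with `printf 'one\ntwo\r\nthree\n' > temp.txt`, calling `count_line_endings temp.txt` prints `crlf=1 lf=2`.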

2- When I run a ps1 file with an LF line ending, can it call another ps1 file with mixed CRLF line endings?

Yes - this follows from the above.
However, special considerations apply to shebang-line-based PowerShell scripts on Unix-like platforms:

  • Such stand-alone shell scripts - which needn't and arguably shouldn't have a .ps1 extension - are first read by the system on Unix-like platforms, and therefore require the shebang line - by definition the first line - to be terminated with a LF-only ("`n") newline, given that only LF by itself is considered a newline on Unix-like platforms.[2] All remaining lines are then read only by PowerShell, and any mix of CRLF and LF is then accepted, as usual; e.g.:

    # Run on any Unix-like platform - note that `n alone must end the first line.
    PS> "#!/usr/bin/env pwsh`n'one'`r`n'two'" > temp; chmod a+x temp; ./temp; Remove-Item temp
    one
    two
    
  • In practice, not least due to the not insignificant startup cost of pwsh, the PowerShell (Core) CLI, but also due to several bugs as of PowerShell 7.2.6, stand-alone shebang-line-based PowerShell scripts - which are primarily useful for being called from outside PowerShell - are rare.
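
To make the effect of a CRLF-terminated shebang line visible, the following sketch approximates what the system ends up treating as the interpreter (plus optional argument): everything after `#!` up to, but excluding, the first LF. It is a simplification and does not reproduce every detail of real kernel parsing (such as whitespace trimming), and `show_shebang_target` is a hypothetical name chosen for this example. A retained CR is printed as a literal `\r` so that it can be seen.

```shell
# Hypothetical helper: print the text after "#!" on the first line of the
# file given as $1, rendering a trailing carriage return as a visible "\r".
show_shebang_target() {
    cr=$(printf '\r')
    first_line=$(head -n 1 "$1")
    case "$first_line" in
        '#!'*) target=${first_line#'#!'} ;;
        *)     echo "no shebang line" >&2; return 1 ;;
    esac
    case "$target" in
        # The CR is part of the line as far as the system is concerned.
        *"$cr") printf '%s\\r\n' "${target%"$cr"}" ;;
        *)      printf '%s\n' "$target" ;;
    esac
}
```

For a script whose first line is `#!/usr/bin/env pwsh` followed by CRLF, this prints `/usr/bin/env pwsh\r`, which shows why the lookup of the target executable or its last argument fails; with an LF-only first line it prints `/usr/bin/env pwsh`.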

3- Is the PWSH line ending requirement documented and consistent across all Linux distributions?

No, it isn't documented.
Yes, as implied by the above, it is consistent, not just across Linux distributions, but across all supported platforms.


[1] Even CR-only newlines - as used in long-obsolete legacy mac OS versions, which should therefore be avoided nowadays - are recognized in PowerShell source code and by Get-Content, but not by Measure-Object -Line, for instance.

[2] If the first line ends in CRLF, the CR (\r) is retained as part of the line and therefore as part of the target executable path or the last option passed to it, which breaks the invocation - see this answer for a real-life manifestation of this problem.

mklement0

Being as canonical as I can be:

I worked on PowerShell v2-v3.

Newlines of either form were always meant to be interchangeable within PowerShell.

There was a good amount of Unix influence in the language. Being able to support both forms of newlines was near and dear to many a team member's heart. Supporting it from the get-go prevented the possibility of a script failing to run just because it was copied to Mac or Linux and saved in the wrong editor.

The importance of this feature has been proved again and again, and has obviously become mission critical now that PowerShell Core is a thing (because the scenario listed above is far more common). I'd wager that as long as PowerShell is a language, this will be the behavior of PowerShell scripts.

As far as shebang files go, this isn't really an exception to this rule. With a shebang file, Unix is reading the file line by line and then sending it to the interpreter. A carriage return is outside of its range of expectations, not PowerShell's.

Hope this helps shed some light on things.

Start-Automating
  • Thank you so much for the insights. I have conflicting thoughts on this. On the one hand, it would feel broken if PowerShell only worked with one type of line ending. On the other hand (and this is somewhat of a paradox), working with either line ending makes PowerShell somewhat _less_ cross-platform, because if it is written using CRLF then it can't be used as a semi-native command script on *nix (because of the inability to make it auto-executable using `#!/usr/bin/env powershell`). Part of the blame lies in the inflexibility of `sh`, but part lies with MS-DOS for going with CRLF eons ago. – Garret Wilson Oct 09 '22 at 16:17
  • I think "canonical" was intended to mean "authoritative", and although your answer is useful, it provides no references. Thanks for the response, but the bounty has expired so I'm deciding to assign it to the answer that at least provided related references and the results of actual experiments. – Garret Wilson Oct 10 '22 at 16:16
  • Good to know the background. Note that the way you describe shebang-line handling isn't quite correct: the system functions that handle executions read up to but excluding the _first LF_ (``\n``) in order to extract the target executable (possibly with parameters) from the shebang line. The target executable is then invoked with the _file path_ of the script at hand passed _as an argument_. But, yes, the problem is that the way the system functions read the shebang line *includes* a CR (``\r``) before the LF, which is why a script whose _first_ line (i.e. the shebang line) ends with CRLF. – mklement0 Oct 10 '22 at 16:48
  • I forgot the word "breaks" at the end of my comment: "which is why a script whose _first_ line (i.e. the shebang line) ends with CRLF _breaks_" (whereas what newline format the rest of the script uses is irrelevant). Unless you disagree with my description of how Unix-like systems handle shebang lines (do tell me, if so), can you please fix the relevant part of your answer, given that it can lead to misconceptions? (Note that if the system were to feed the file line by line to the target interpreter, the name and path of the script file would be unavailable to the interpreter.) – mklement0 Oct 10 '22 at 19:46
  • In the absence of feedback I'm down-voting, for the reasons stated - nothing personal. Always willing to reconsider if a dialogue should lead to a shared understanding that warrants undoing the vote. – mklement0 Oct 10 '22 at 22:10
  • As far as the Shebang behavior, I'm sure @mklement0 is correct, though the distinction between only the first line having a problem and the rest of the lines having a problem is kinda academic (though I do swear the last time I tried this it complained twice, not once). – Start-Automating Oct 10 '22 at 22:26
  • @GarretWilson I said "as canonical as I could be" because I don't have a link to the standards handy that would highlight this exact issue. – Start-Automating Oct 10 '22 at 22:26
  • (Just to be clear) It was I (mklement0) who posted the preceding comments, not @GarretWilson. While it is true that the distinction I'm pointing out isn't important in the realm of _PowerShell_, to my mind it _is_ important not to let an incorrect description stand, even if it is somewhat incidental to the question at hand. After all, someone may think that your description accurately describes shebang-line handling on Unix-like platforms, potentially causing them to form a wrong mental model as a result. – mklement0 Oct 10 '22 at 22:36