46

Editor's note: Judging by later comments by the OP, the gist of this question is: How can you convert a file with CRLF (Windows-style) line endings to a LF-only (Unix-style) file in PowerShell?

Here is my PowerShell script:

 $original_file ='C:\Users\abc\Desktop\File\abc.txt'
 (Get-Content $original_file) | Foreach-Object {
 $_ -replace "'", "2"`
-replace '2', '3'`
-replace '1', '7'`
-replace '9', ''`
-replace "`r`n",'`n'
} | Set-Content "C:\Users\abc\Desktop\File\abc.txt" -Force

With this code I am able to replace 2 with 3, 1 with 7, and 9 with an empty string. However, I am unable to replace the carriage return + line feed (CRLF) with just a line feed (LF); that part doesn't work.

mklement0
Angel_Boy

7 Answers

61

This is a state-of-the-union answer as of Windows PowerShell v5.1 / PowerShell Core v6.2.0:

  • Andrew Savinykh's ill-fated answer, despite being the accepted one, is, as of this writing, fundamentally flawed (I do hope it gets fixed - there's enough information in the comments - and in the edit history - to do so).

  • Ansgar Wiechers' helpful answer works well, but requires direct use of the .NET Framework (and reads the entire file into memory, though that could be changed). Direct use of the .NET Framework is not a problem per se, but is harder to master for novices and hard to remember in general.

  • A future version of PowerShell Core may introduce a Convert-TextFile cmdlet with a -LineEnding parameter to allow in-place updating of text files with a specific newline style: see GitHub issue #6201.

In PSv5+, PowerShell-native solutions are now possible, because Set-Content now supports the -NoNewline switch, which prevents undesired appending of a platform-native newline[1]:

# Convert CRLFs to LFs only.
# Note:
#  * (...) around Get-Content ensures that $file is read *in full*
#    up front, so that it is possible to write back the transformed content
#    to the same file.
#  * + "`n" ensures that the file has a *trailing LF*, which Unix platforms
#     expect.
((Get-Content $file) -join "`n") + "`n" | Set-Content -NoNewline $file

The above relies on Get-Content's ability to read a text file that uses any combination of CR-only, CRLF, and LF-only newlines line by line.
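As a quick check of that behavior, here is a minimal sketch; the temp file and its contents are made up for illustration. It writes a file that mixes all three newline styles and reads it back with Get-Content:

```powershell
# Create a temp file that mixes CRLF, LF-only, and CR-only newlines.
$tmp = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($tmp, "a`r`nb`nc`rd")

# Get-Content returns one string per line, with the newlines stripped.
$lines = @(Get-Content $tmp)
$lines.Count          # number of lines read
$lines -join '|'      # the lines, joined for display

Remove-Item $tmp
```

If Get-Content behaves as described, each of the four letters ends up on a line of its own, regardless of which newline style followed it.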

Caveats:

  • You need to specify the output encoding to match the input file's in order to recreate it with the same encoding. The command above does NOT specify an output encoding; to do so, use -Encoding. By default, without -Encoding:

    • In Windows PowerShell, you'll get "ANSI" encoding, your system's single-byte, 8-bit legacy encoding, such as Windows-1252 on US-English systems.

    • In PowerShell (Core), v6+, you'll get UTF-8 encoding without a BOM.

  • The input file's content as well as its transformed copy must fit into memory as a whole, which can be problematic with large input files, though is rarely a concern with text files.

  • There's a small risk of file corruption, if the process of writing back to the input file gets interrupted.
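To illustrate the encoding caveat, here is a minimal sketch that passes the same -Encoding value to both Get-Content and Set-Content. The function name Convert-ToLf and its parameters are hypothetical, and 'utf8' is only an assumed default; use whatever encoding your input file actually has. Note that in Windows PowerShell, -Encoding utf8 writes a BOM, whereas in PowerShell (Core) it does not.

```powershell
# Hypothetical helper (not a built-in cmdlet): converts a file to LF-only
# newlines in place, reading and writing with the same encoding.
function Convert-ToLf {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [string] $Encoding = 'utf8'   # assumed default; match your input file
    )
    # Assigning to a variable reads the file in full before it is overwritten.
    $lines = Get-Content -Encoding $Encoding $Path
    ($lines -join "`n") + "`n" | Set-Content -NoNewline -Encoding $Encoding $Path
}

# Usage with a made-up temp file:
$tmp = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($tmp, "x`r`ny`r`n")
Convert-ToLf -Path $tmp
$converted = [IO.File]::ReadAllText($tmp)
$converted.Contains("`r")   # $false if no CR is left
```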


[1] In fact, if there are multiple strings to write, -NoNewline also doesn't place a newline between them; in the case at hand, however, this is irrelevant, because only one string is written.
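The footnote's point can be seen in a small sketch (PSv5+; the temp file is made up for illustration):

```powershell
# With -NoNewline, multiple input strings are written back to back,
# with no newline between them and none at the end.
$tmp = [IO.Path]::GetTempFileName()
'a', 'b' | Set-Content -NoNewline $tmp
$written = [IO.File]::ReadAllText($tmp)
$written   # -> ab
```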

mklement0
43

You have not specified the version; I'm assuming you are using PowerShell v3.

Try this:

$path = "C:\Users\abc\Desktop\File\abc.txt"
(Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force

Editor's note: As mike z points out in the comments, Set-Content appends a trailing CRLF, which is undesired. Verify with: 'hi' > t.txt; (Get-Content -Raw t.txt).Replace("`r`n","`n") | Set-Content t.txt; (Get-Content -Raw t.txt).EndsWith("`r`n"), which yields $True.

Note this loads the whole file in memory, so you might want a different solution if you want to process huge files.
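In PSv5+, the trailing CRLF described in the editor's note can be avoided by adding -NoNewline, as suggested in the comments on this answer. A minimal sketch, using a made-up temp file rather than the path from the question:

```powershell
# Create a small CRLF test file.
$path = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($path, "hi`r`nthere`r`n")

# -Raw reads the file as a single string, newlines included;
# -NoNewline stops Set-Content from appending a platform-native newline.
(Get-Content $path -Raw).Replace("`r`n", "`n") | Set-Content $path -NoNewline -Force

$result = [IO.File]::ReadAllText($path)
$result.Contains("`r")   # -> False
```

Like the original command, this still reads the whole file into memory, and it does not specify an output encoding.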

UPDATE

This might work for v2 (sorry, nowhere to test):

$in = "C:\Users\abc\Desktop\File\abc.txt"
$out = "C:\Users\abc\Desktop\File\abc-out.txt"
(Get-Content $in) -join "`n" > $out

Editor's note: Note that this solution (now) writes to a different file and is therefore not equivalent to the (still flawed) v3 solution. (A different file is targeted to avoid the pitfall Ansgar Wiechers points out in the comments: using > truncates the target file before execution begins). More importantly, though: this solution too appends a trailing CRLF, which may be undesired. Verify with 'hi' > t.txt; (Get-Content t.txt) -join "`n" > t.NEW.txt; [io.file]::ReadAllText((Convert-Path t.NEW.txt)).endswith("`r`n"), which yields $True.

The same reservation about loading the whole file into memory applies here, though.

Andrew Savinykh
  • 9
    That will almost work. `Set-Content` will still insert an extra CR/LF at the end. – Mike Zboray Oct 02 '13 at 00:10
  • I see this: $psversiontable.psversion reports Major 2, Minor 0, Build -1, Revision -1. – Angel_Boy Oct 02 '13 at 00:13
  • @Zespri: This is the error message that I get when I execute your script: $path = "C:\Users\abc\Desktop\File\abc.txt" (Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force Get-Content : A parameter cannot be found that matches parameter name 'Raw'. At line:2 char:24 + (Get-Content $path -Raw <<<< ).Replace("`r`n","`n") | Set-Content $path -Force + CategoryInfo : InvalidArgument: (:) [Get-Content], ParameterBindingException + FullyQualifiedErrorId : NamedParameterNotFound,Microsoft.PowerShell.Commands.GetContentCommand – Angel_Boy Oct 02 '13 at 00:14
  • Yep, this is because the example is for powershell v3 and you are using v2. There is no -Raw switch for v2. – Andrew Savinykh Oct 02 '13 at 00:15
  • 4
    Great, I updated to PowerShell v3 and your code worked, but it still leaves CR/LF at the end like mike mentioned. I just want all LF's and no CR/LF's – Angel_Boy Oct 02 '13 at 00:38
  • 1
    Your suggestion for PowerShell v2 will erase the files content, because the redirection will create a new empty file before the subshell can read it. Please remove it. – Ansgar Wiechers Oct 02 '13 at 08:06
  • @AnsgarWiechers is it different in v3? Because in v3 it "works for me". And thank you for your feedback. – Andrew Savinykh Oct 02 '13 at 21:14
  • 1
    The behavior is identical in PowerShell v2 and v3. Using the redirection operator truncates the file before it's read by `Get-Content`. – Ansgar Wiechers Oct 03 '13 at 07:24
  • @AnsgarWiechers, ok, thank you again for spotting this. Initially, I thought along the same lines as you, but then I tested it and it worked. So I assumed that the file is read before the redirection operation kicks in. Apparently I have not tested it properly. Should be fine now. – Andrew Savinykh Oct 03 '13 at 23:37
  • @AW and zespri, big thanks. FYI - AW's answer below failed for me with posh2 but worked with posh3. – timB33 Feb 14 '14 at 14:56
  • 3
    PSv5+ offers a solution to the trailing CRLF problem: `Set-Content -NoNewline`. The truncation of the output file with `>` can be avoided by using `| Out-File …` (or `| Set-Content …`) instead. – mklement0 Dec 21 '16 at 19:12
  • thanks for the `-raw` here I had some text files that were throwing inconsistent content lengths until I added that. saved from hours of grief. – ATek Jan 12 '18 at 21:18
  • @mklement0 I think you got wrong end of the stick. The solution was already fixed according to Ansgar Wiechers' feedback so your comment does not reflect reality. I'm going to roll it back. Please comment here before editing again if you disagree – Andrew Savinykh Feb 21 '18 at 22:58
  • @AndrewSavinykh: My apologies for my (since rolled-back) 2nd editor's note: I had missed that you're now writing to a _different_ file - _that fact is worth mentioning in the answer, though, because it means that your v2 solution is not equivalent_. The 1st (since rolled-back) editor's note still stands, however: _your v3 solution is flawed, as demonstrated in the note_. Please edit your answer accordingly - there's all the information you need in the comments. Given the popularity of your answer, not fixing the problem is a disservice to future readers. – mklement0 Feb 21 '18 at 23:24
  • @AndrewSavinykh: I've since realized: your edit to the v2 solution fixed its _data-destroying_ flaw, yet now suffers the same shortcoming as the v3 solution (appending an unwanted CRLF). In the interim, until _you_ get around to fixing your answer, I've reinstated updated versions of my editor's notes as a service to future readers. I do hope you get around to fixing your answer. Do let me know where I'm wrong. (I won't fight you on rolling back my edit again, but I encourage users who read this to check out the edit history). – mklement0 Feb 22 '18 at 02:06
  • @mklement0, sorry, what you are saying does not make sense to me. Text files [should end with a newline](https://stackoverflow.com/questions/729692/why-should-text-files-end-with-a-newline). If they do, then no additional new line is added by the code snippet. If you want something other than the regular situation (that is absence of "CRLF" where it's due), you can ask a different question. You also can provide a different _answer_ if this one feels unsatisfactory to you, although I can't understand why. As far as I can see there is nothing that needs "fixing". – Andrew Savinykh Feb 22 '18 at 04:10
  • @mklement0, same as you I'm open to feedback, so if you want to try again and explain to me why your version of truth is better than mine, I'm all ears. - Just remember that you already made a mistake here that you realized later. Are you sure you are holding your ground because of "service to future reader" and not because it does not feel good admitting that you were mostly wrong? – Andrew Savinykh Feb 22 '18 at 04:11
  • @AndrewSavinykh: To quote from the OP's comment above: "your code worked, but it still leaves CR/LF at the end" (I've since added a note to the question to make its gist more obvious). In other words: a CRLF sequence at the end of the file is _undesired_ - the intent is to convert _all_ CRLF newlines to LF-only newlines - _the output mustn't have any CRLF_. Both your solutions fall short, because they end the output with a CRLF (with an _additional_ one in the v3 solution, if the input had a trailing newline, unlike in the v2 solution, but that's a moot point). – mklement0 Feb 22 '18 at 04:25
  • @AndrewSavinykh: Re "You also can provide a different answer if this one feels unsatisfactory to you": [I have](https://stackoverflow.com/a/48919146/45375), but given your answer's popularity, it may not get noticed, which is why I'm hoping to get your answer fixed. – mklement0 Feb 22 '18 at 04:28
  • @mklement0 would it be acceptable, if we roll back your comments both of the question and of my answer and I'll add the link to your answer in the end saying: "While having a new line at the end of the text file is standard, sometimes it's desirable to avoid it. If you are in this situation please refer to mklement0's answer (link)"? – Andrew Savinykh Feb 22 '18 at 04:41
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/165610/discussion-between-andrew-savinykh-and-mklement0). – Andrew Savinykh Feb 22 '18 at 04:42
  • I know this thread is very old but I had the same issue and couldn't find any other source on the topic. I kind of solved the issue @AndrewSavinykh had with the last CRLF printed by Set-Content, one just has to remove the last 2 lines of the file according to this thread https://stackoverflow.com/questions/11643043/remove-last-line-from-file-with-powershell. This leaves the file with a blank line after the last LF, avoids the usage of .NET, and should work with Powershell above version 2 (if you have version 5 just use -NoNewLine) – luigi Nov 25 '20 at 15:25
30

Alternative solution that won't append a spurious CR-LF:

$original_file ='C:\Users\abc\Desktop\File\abc.txt'
$text = [IO.File]::ReadAllText($original_file) -replace "`r`n", "`n"
[IO.File]::WriteAllText($original_file, $text)
Ansgar Wiechers
  • 2
    Nicely done (works in v2 too). A tip re use of _relative_ paths: Use `(Convert-Path $original_file)` to convert relative paths to full paths first, because the .NET framework's idea of what the current directory is usually differs from PS's. – mklement0 Dec 21 '16 at 19:41
  • What would the replace clause look like if you wanted to switch Unix to Windows, but it was possible that it was _already_ Windows. – Seth Nov 22 '17 at 16:39
  • 3
    @Seth Use a negative lookbehind assertion: ``'(?<!\r)\n', "`r`n"`` (replace LF with CR-LF only if LF is not preceded by CR). – Ansgar Wiechers Nov 22 '17 at 17:17
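A minimal sketch of the lookbehind-based replacement from the comment above, applied to a made-up string that mixes LF-only and CRLF newlines:

```powershell
# Sample input: 'one' and 'three' end in LF only, 'two' already ends in CRLF.
$mixed = "one`ntwo`r`nthree`n"

# Replace LF with CRLF only where the LF is not already preceded by a CR,
# so existing CRLF sequences are left alone.
$crlf = $mixed -replace '(?<!\r)\n', "`r`n"

$crlf -ceq "one`r`ntwo`r`nthree`r`n"   # -> True
```

With the [IO.File] approach from this answer, the same -replace operation can be applied to the string returned by ReadAllText before passing the result to WriteAllText.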
3

Below is my script for converting all files recursively. You can specify folders or files to exclude.

# Regular expressions matched against each file's path; adjust as needed.
$excludeFolders = "node_modules|dist|\.vs";
$excludeFiles = ".*\.map.*|.*\.zip|.*\.png|.*\.ps1"

Function Dos2Unix {
    [CmdletBinding()]
    Param([Parameter(ValueFromPipeline)] $fileName)

    process {
        # Progress indicator: one dot per file examined.
        Write-Host -NoNewline "."

        # -Raw reads the whole file as a single string, preserving its newlines.
        $fileContents = Get-Content -Raw $fileName
        If ($fileContents -match "\r\n")
        {
            Write-Host "`r`nCleaning file: $fileName"
            # Note: the file is rewritten as UTF-8, without a trailing newline being added.
            Set-Content -NoNewline -Encoding utf8 $fileName ($fileContents -replace "`r`n","`n")
        }
    }
}

Get-ChildItem -File "." -Recurse |
Where-Object {$_.PSParentPath -notmatch $excludeFolders} |
Where-Object {$_.PSPath -notmatch $excludeFiles} |
ForEach-Object { $_.PSPath | Dos2Unix }
GeekyMonkey
  • 1
    Heads up: This is opinionated to use utf8 as encoding and *not* add a new line at the end. I used this after accidentally pushing an entire project with crlf to VCS which murdered the gradle build. – geisterfurz007 Jan 05 '21 at 18:34
  • Digging a little more, this is caused by PowerShell adding the BOM to the start of the file. For ways around that either check [here](https://stackoverflow.com/questions/5596982/using-powershell-to-write-a-file-in-utf-8-without-the-bom) or don't use PowerShell to rewrite your files :') – geisterfurz007 Jan 06 '21 at 08:19
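One possible way around the BOM mentioned in the comment above is to write the file via .NET with an explicitly BOM-less UTF-8 encoding. This is only a sketch of that idea, not part of the script above; the temp file is made up for illustration:

```powershell
# Create a small CRLF test file.
$path = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($path, "a`r`nb`r`n")

# Convert CRLF to LF in memory.
$text = [IO.File]::ReadAllText($path) -replace "`r`n", "`n"

# UTF8Encoding constructed with $false does not emit a BOM.
$utf8NoBom = New-Object System.Text.UTF8Encoding $false
[IO.File]::WriteAllText($path, $text, $utf8NoBom)

# Inspect the raw bytes: the file should start with 'a' (97), not a BOM (239, 187, 191).
$bytes = [IO.File]::ReadAllBytes($path)
$bytes -join ' '
```

Since .NET resolves relative paths against its own working directory, pass full paths (for example via Convert-Path) when using this outside of a test.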
2

Adding another version, based on the examples by @ricky89 and @mklement0, with a few improvements:

Script to process:

  • *.txt files in the current folder
  • replace LF with CRLF (Unix to Windows line-endings)
  • save resulting files to CR-to-CRLF subfolder
  • tested on 100MB+ files, PS v5;

LF-to-CRLF.ps1:

# get current dir
$currentDirectory = Split-Path $MyInvocation.MyCommand.Path -Parent

# create subdir CR-to-CRLF for new files
$outDir = $(Join-Path $currentDirectory "CR-to-CRLF")
New-Item -ItemType Directory -Force -Path $outDir | Out-Null

# get all .txt files
Get-ChildItem $currentDirectory -Force | Where-Object {$_.extension -eq ".txt"} | ForEach-Object {
  $file = New-Object System.IO.StreamReader -Arg $_.FullName
  # Resulting file will be in CR-to-CRLF subdir
  $outstream = [System.IO.StreamWriter] $(Join-Path  $outDir $($_.BaseName + $_.Extension))
  $count = 0 
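  # read line by line; ReadLine() strips each line's original ending (LF or CRLF),
  # and $outstream.WriteLine appends the platform-native newline (CRLF on Windows)
  # compare against $null so that an empty line doesn't end the loop early
  while ($null -ne ($line = $file.ReadLine())) {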
        $count += 1
        $outstream.WriteLine($line)
    }
  $file.close()
  $outstream.close()
  Write-Host ("$_`: " + $count + ' lines processed.')
}
Rod
1

A one-liner for converting a file to LF-only newlines from CMD:

powershell -NoProfile -command "((Get-Content 'prueba1.txt') -join \"`n\") + \"`n\" | Set-Content -NoNewline 'prueba1.txt'"

You can put this line in a .bat file.

santiago
0

The following will be able to process very large files quickly.

$file = New-Object System.IO.StreamReader -Arg "file1.txt"
$outstream = [System.IO.StreamWriter] "file2.txt"
$count = 0 

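# Compare against $null so that an empty line doesn't end the loop early.
while ($null -ne ($line = $file.ReadLine())) {
      $count += 1
      # ReadLine() returns the line without its line ending, so no -replace is needed;
      # WriteLine() appends the platform-native newline (CRLF on Windows).
      $outstream.WriteLine($line)
  }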

$file.close()
$outstream.close()

Write-Host ([string] $count + ' lines have been processed.')
mklement0
ricky89
  • 4
    On Windows, this works for LF -> CRLF conversion (the opposite of what the OP wanted), but only accidentally so: `System.IO.StreamReader` can also read LF-only files, and `.ReadLine()` returns a line _without_ its original line ending (whether it was LF or CRLF), so the `-replace` operation does nothing. On Windows, `System.IO.StreamReader` appends CRLF when using `.WriteLine()`, so that's how the CRLF line breaks end up in the output file. – mklement0 Dec 21 '16 at 21:16
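For the direction the question actually asks about (CRLF to LF), the same streaming approach can be sketched by setting the writer's NewLine property explicitly. The function name Convert-CrlfToLfStreaming and its parameters are hypothetical, and the output is written as UTF-8 without a BOM (the StreamWriter default), which may not match the input file's encoding:

```powershell
# Hypothetical helper (not a built-in cmdlet): streams a file line by line
# and writes it to a different file with LF-only newlines.
function Convert-CrlfToLfStreaming {
    param(
        [Parameter(Mandatory)] [string] $InPath,    # full path to the input file
        [Parameter(Mandatory)] [string] $OutPath    # full path to a different output file
    )
    $reader = New-Object System.IO.StreamReader -ArgumentList $InPath
    $writer = New-Object System.IO.StreamWriter -ArgumentList $OutPath
    $writer.NewLine = "`n"   # make WriteLine() append LF instead of the platform default
    $count = 0
    try {
        # Compare against $null so that empty lines don't end the loop early.
        while ($null -ne ($line = $reader.ReadLine())) {
            $writer.WriteLine($line)
            $count++
        }
    }
    finally {
        $reader.Close()
        $writer.Close()
    }
    $count   # number of lines written
}

# Usage with made-up temp files:
$in  = [IO.Path]::GetTempFileName()
$out = [IO.Path]::GetTempFileName()
[IO.File]::WriteAllText($in, "one`r`n`r`nthree`r`n")
$n = Convert-CrlfToLfStreaming -InPath $in -OutPath $out
$outText = [IO.File]::ReadAllText($out)
```

Because WriteLine() is called for every line, the output always ends with a trailing LF, even if the input's last line had no newline.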