I am trying to replicate the functionality of the cat command in Unix.

I would like to avoid solutions where I explicitly read both files into variables, concatenate the variables together, and then write out the concatenated variable.

mklement0
merlin2011
  • 1
    Closely related question about merging files via a _copy_ operation: https://stackoverflow.com/q/71209707/45375 – mklement0 Feb 22 '22 at 18:51

11 Answers

235

Simply use the Get-Content and Set-Content cmdlets:

Get-Content inputFile1.txt, inputFile2.txt | Set-Content joinedFile.txt

You can concatenate more than two files with this style, too.

If the source files are named similarly, you can use wildcards:

Get-Content inputFile*.txt | Set-Content joinedFile.txt

Note 1: In PowerShell 5 and older versions, this can be written more concisely with the aliases cat and sc for Get-Content and Set-Content, respectively. However, these aliases are problematic because cat is a system command on *nix systems and sc is a system command on Windows. Using them is therefore not recommended, and sc is no longer even defined in PowerShell (Core) 7. The PowerShell team recommends against using aliases in general.

Note 2: Be careful with wildcards. If the output file name itself matches the pattern (for example, writing to inputFiles.txt with the pattern above), PowerShell will get into an infinite loop! (I just tested this.)

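Note 3: Outputting to a file with > does not preserve character encoding! In Windows PowerShell, > behaves like Out-File and writes UTF-16LE by default. This is why using Set-Content is recommended, although it also applies its own default encoding unless you pass -Encoding.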
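
To make the output encoding explicit rather than relying on a cmdlet default, one option is to pass -Encoding to Set-Content. The following is a minimal sketch, assuming the inputs are UTF-8 text; the scratch directory and file names are placeholders created only so the example runs on its own:

```powershell
# Hypothetical sample files in a scratch directory (all names are placeholders).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'concat-demo-encoding'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$file1 = Join-Path $dir 'inputFile1.txt'
$file2 = Join-Path $dir 'inputFile2.txt'
Set-Content -Path $file1 -Value 'first', 'second' -Encoding UTF8
Set-Content -Path $file2 -Value 'third' -Encoding UTF8

# State the encoding explicitly instead of relying on the cmdlet's default.
# The output name is chosen so that an inputFile*.txt wildcard would not match it.
$joined = Join-Path $dir 'joined.txt'
Get-Content -Path $file1, $file2 | Set-Content -Path $joined -Encoding UTF8

Get-Content -Path $joined   # first, second, third (one per line)
```

Note that -Encoding UTF8 writes a BOM in Windows PowerShell but not in PowerShell 7, so the resulting bytes can differ between the two editions.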

Smi
  • 7
    Just in case someone wants to iterate over files with the _Get-ChildItem | ForEach-Object_ construct you might want to use Add-Content instead of Set-Content. Otherwise the target file is overwritten in each iteration. – Jonas Feb 06 '18 at 14:53
  • 4
    Note that by default `Set-Content` uses the national code page (e.g. Windows-1252 for English). If the source files use a different encoding (e.g. Windows-1251 or UTF-8), you must set the correct encoding: `sc file.txt -Encoding UTF8` (numeric code pages such as 1251 for Russian are supported since v6.2) – Radek Pech May 03 '19 at 09:06
  • 1
    @Jonas The problem with `Add-Content` is that if you run the command twice, the aggregated file is twice as long. A good replacement is `Out-File`. Example [here](https://stackoverflow.com/a/57379958/1152054) – Dan Friedman Aug 06 '19 at 16:02
  • 2
    Doesn't seem to work if the files are binary (for example, parts of a zipfile in my case). – Daniel Lidström Mar 18 '20 at 10:13
  • 4
    @DanielLidström It also works for binaries with the correct parameters: `Get-Content my.bin -Raw | Set-Content my.bin -NoNewline` will not alter `my.bin` except the timestamps. `-Raw` preserves any CR/LF bytes, while `-NoNewline` prevents PowerShell from adding its own CR/LF bytes. – stackprotector Jun 23 '20 at 10:36
  • How do we do this in alphabetical order? I want to merge files in a specific order. – Alex Gordon Oct 28 '21 at 15:17
  • @Jonas: Instead of using _per-file_ `Add-Content` calls inside a `ForEach-Object` script block, append another pipeline segment that pipes _directly_ to `Set-Content`, which will send all input to a single output file. Not only is this much faster, it avoids the problem of `Add-Content` preserving preexisting content in the output file. – mklement0 Feb 22 '22 at 18:31
  • Smi: _No_ file-writing cmdlets _preserve_ input encoding, because they have no information about it (`Get-Content` reads text into .NET strings without saving information about the original encoding). Thus, in the absence of an explicit `-Encoding` argument, it is a file-writing cmdlet's _own default_ that determines the encoding. `>` is in effect an alias of `Out-File`, which in _Windows PowerShell_ unfortunately defaults to "Unicode" (UTF-16). `Set-Content` defaults to the system's legacy ANSI code page, as @RadekPech notes. In PowerShell _Core_ BOM-less UTF-8 is now the consistent default. – mklement0 Feb 22 '22 at 18:37
  • Also, ending up with a different character encoding is not the only potential problem: the _newline format_ (LF vs. CRLF) may change, and a newline will be appended to the content of files that happen to have no trailing newline. However, you can address this in PowerShell v5+ with a combination of `Get-Content -Raw` and `Set-Content -NoNewLine`. – mklement0 Feb 22 '22 at 18:50
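
The ordering question and the single-write advice in the comments above can be combined. This is only a sketch, assuming alphabetical order by file name is what is wanted; the scratch directory, the part-*.txt names, and merged.out are hypothetical:

```powershell
# Hypothetical sample files, deliberately created out of order.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'concat-demo-order'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
Set-Content -Path (Join-Path $dir 'part-b.txt') -Value 'from b'
Set-Content -Path (Join-Path $dir 'part-a.txt') -Value 'from a'

# Sort explicitly, read each file inside ForEach-Object, and pipe everything
# to ONE Set-Content call so the target is written once, not once per file.
$merged = Join-Path $dir 'merged.out'
Get-ChildItem -Path $dir -Filter 'part-*.txt' |
    Sort-Object Name |
    ForEach-Object { Get-Content -LiteralPath $_.FullName } |
    Set-Content -Path $merged

Get-Content -Path $merged   # from a, then from b
```

Because Set-Content replaces the target on each run, running this twice does not double the output, unlike per-file Add-Content calls.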
63

Do not use >; it messes up the character encoding. Use:

Get-Content files.* | Set-Content newfile.file
Smi
Joakim
  • 1
    `cat` is an alias for `Get-Content`. – n0rd Sep 05 '15 at 22:31
  • 5
    @n0rd I think it was more of a "use the pipeline instead" thing. – ksoo Jan 30 '16 at 20:09
  • Can confirm. Was getting `ÿþ` which is `FF FE` at the beginning of my concatenated file when using `>`. – gpresland Aug 23 '19 at 20:38
  • 1
    `>` is an effective alias of `Out-File`, which in Windows PowerShell defaults to "Unicode" (UTF-16LE), whereas `Set-Content` defaults to the system's legacy ANSI code page. While the latter encoding is less problematic, note that _both_ cmdlets have the potential to alter the encoding of the input files, because their default encoding is unrelated to the encoding of the input files (which is information that PowerShell doesn't make available). Note that _PowerShell (Core) 7+_ now fortunately defaults to (BOM-less) UTF-8, consistently across all cmdlets. – mklement0 Feb 22 '22 at 18:57
26

In cmd, you can do this:

copy one.txt+two.txt+three.txt four.txt

In PowerShell this would be:

cmd /c copy one.txt+two.txt+three.txt four.txt

While the PowerShell way would be to use gc (Get-Content), the above will be pretty fast, especially for large files. It can also be used on non-ASCII files by using the /B switch.
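
If shelling out to cmd is not an option (for example, on non-Windows systems), a byte-for-byte concatenation can also be sketched with .NET streams from PowerShell. This is not part of the answer above, just one possible alternative; the scratch directory and the .bin file names are hypothetical:

```powershell
# Hypothetical binary sample files (names and contents are placeholders).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'concat-demo-binary'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$one = Join-Path $dir 'one.bin'
$two = Join-Path $dir 'two.bin'
[System.IO.File]::WriteAllBytes($one, [byte[]](0x00, 0xFF, 0x1A))
[System.IO.File]::WriteAllBytes($two, [byte[]](0x0D, 0x0A))

# Copy raw bytes from each source into the target; nothing is decoded as text,
# so encodings and newline bytes are left untouched.
$target = Join-Path $dir 'joined.bin'
$out = [System.IO.File]::Create($target)
try {
    foreach ($path in $one, $two) {
        $source = [System.IO.File]::OpenRead($path)
        try { $source.CopyTo($out) } finally { $source.Dispose() }
    }
}
finally {
    $out.Dispose()
}

(Get-Item -LiteralPath $target).Length   # 5
```

Full paths are passed to the .NET methods on purpose, since .NET's current directory is not necessarily the same as PowerShell's current location.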

Peter Mortensen
manojlds
  • 3
    For me the cat command runs multiple orders of magnitude longer than the cmd /c command (which runs really quick); thanks for pointing out the option! – Rob Aug 13 '14 at 12:23
  • 1
    This is the best answer. – Nicholas DiPiazza May 03 '20 at 20:00
  • 1
    You should add `/b` to the target file to prevent byte 0x1A being added to the end of the file: `copy one.txt+two.txt+three.txt four.txt /b`. See [this Q&A](https://stackoverflow.com/q/9699976/11942268). – stackprotector Apr 21 '22 at 12:38
15

You could use the Add-Content cmdlet. It may be a little faster than the other solutions, because the content of the first file is never retrieved.

gc .\file2.txt | Add-Content -Path .\file1.txt
Peter Mortensen
mjsr
  • To what does `gc` refer? – octopusgrabbus May 08 '18 at 16:40
  • 1
    `gc` is an alias for Get-Content – MM. Oct 20 '18 at 18:01
  • `gc` (`Get-Content`) _does_ retrieve the file content, line by line by default. Use `Set-Content`, not `Add-Content`, because the latter will preserve any preexisting content in the output file. Note the potential to end up with a different character encoding in the output file (irrespective of what cmdlet you use), as discussed in the comments on the accepted answer. – mklement0 Feb 22 '22 at 18:59
10

To concatenate files in the command prompt, use:

type file1.txt file2.txt file3.txt > files.txt

In PowerShell, type is an alias for Get-Content, so the command above produces an error there: Get-Content requires multiple files to be separated by commas. The same command in PowerShell would be:

Get-Content file1.txt,file2.txt,file3.txt | Set-Content files.txt
Martin Prikryl
Brian Kimball
5

I used:

Get-Content c:\FileToAppend_*.log | Out-File -FilePath C:\DestinationFile.log -Encoding ASCII -Append

This appended fine. I added the ASCII encoding to get rid of the NUL characters that Notepad++ showed when no encoding was specified.

Phoenix14830
5

If you need to order the files by a specific property (e.g. last write time):

Get-ChildItem *.log | Sort-Object LastWriteTime | ForEach-Object { Get-Content $_.FullName } | Set-Content result.log
Roman O
4

To keep encoding and line endings:

Get-Content files.* -Raw | Set-Content newfile.file -NoNewline

Note: as far as I recall, these parameters aren't supported by older PowerShell versions (<3? <4?).
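
What this combination does to newlines can be checked with a small sketch. It assumes PowerShell v5 or later, as noted in the comments on the first answer; the scratch directory and file names are hypothetical:

```powershell
# Hypothetical sample files with different newline conventions.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'concat-demo-raw'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$a = Join-Path $dir 'a.txt'
$b = Join-Path $dir 'b.txt'
[System.IO.File]::WriteAllText($a, "one`ntwo")      # LF inside, no trailing newline
[System.IO.File]::WriteAllText($b, "three`r`n")     # CRLF-terminated

# -Raw reads each file as one string (newlines untouched);
# -NoNewline stops Set-Content from adding newlines of its own.
$merged = Join-Path $dir 'merged.txt'
Get-Content -LiteralPath $a, $b -Raw | Set-Content -LiteralPath $merged -NoNewline

[System.IO.File]::ReadAllText($merged) -eq "one`ntwothree`r`n"   # True
```

The existing newlines survive as they were, but since a.txt has no trailing newline, its last line and the first line of b.txt end up on the same line. The output encoding is still the cmdlet's default unless -Encoding is given.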

Ilyan
  • I found that adding `-Encoding unicode` to the end of the command you suggest (in addition to your two parameters) allows Excel to open a combination of CSV files correctly. See https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.management/set-content?view=powershell-7.2#parameters for more. – zylstra Feb 08 '22 at 01:08
3

Since most of the other replies often get the formatting wrong (due to the piping), the safest thing to do is as follows:

add-content $YourMasterFile -value (get-content $SomeAdditionalFile)

I know you wanted to avoid reading the content of $SomeAdditionalFile into a variable, but in order to preserve, for example, your newline formatting, I do not think there is a proper way to do it otherwise.

A workaround would be to loop through $SomeAdditionalFile line by line and pipe each line into $YourMasterFile. However, this is overly resource-intensive.

Kamaradski
3

You can do something like:

get-content input_file1 > output_file
get-content input_file2 >> output_file

Here, > is effectively an alias for Out-File, and >> is effectively an alias for Out-File -Append.

Peter Mortensen
vlad-ardelean
0

I think the "PowerShell way" could be:

set-content destination.log -value (get-content c:\FileToAppend_*.log )
dvjz