424

What is the quickest and most pragmatic way to combine all *.txt files in a directory into one large text file?

Currently I'm using Windows with Cygwin, so I have access to Bash.

A Windows shell command would be nice too, but I doubt there is one.

codeforester
Yada

12 Answers

741

This appends the output to all.txt

cat *.txt >> all.txt

This overwrites all.txt

cat *.txt > all.txt
Robert Greiner
  • you may run into a problem where it cats all.txt into all.txt... I have this problem with grep sometimes, not sure if cat has the same behavior. – rmeador Jan 27 '10 at 23:54
  • @rmeador yes, that's true: if all.txt already exists, you will have this problem. It can be solved by giving the output file a different extension, or by writing all.txt to a different folder. – Robert Greiner Jan 28 '10 at 01:11
  • cat *.txt >> tmp; mv tmp all.txt (and make sure that all.txt does not exist beforehand) – Renaud Feb 14 '13 at 10:16
  • I get "Argument list too long" -- guess it can't handle 40,000+ files. – Matt Sep 16 '13 at 15:51
  • Avoid "Argument list too long" with: `echo *.txt | xargs cat > all.txt` – 5heikki Sep 22 '14 at 08:45
  • You will get some garbage in between if any file doesn't end with a newline. – mootmoot Oct 25 '16 at 10:49
  • Works like a charm, thank you. Just make sure that all input .txt files end with a newline. – MitchellK Jul 18 '17 at 12:51
  • Is there a simple way to add an extra newline between each file? – Max Candocia Mar 30 '18 at 21:11
  • @MaxCandocia `sed -i -e '$a\' filename.txt` will append a newline to filename.txt. `find . -name "*.txt" -type f -print0 | xargs -0 -n 1 -P 8 sed -i -e '$a\'` will do that for all .txt files in the current folder (set -P to the number of your logical processors and it will run that many in parallel). – streamofstars Nov 12 '18 at 21:17
  • How do we add a delimiter in between the files? – Naveen Gopalakrishna Aug 20 '21 at 11:52
  • To make sure that an empty set of input files still produces a new file (or, better said, replaces any old file that was in place), you can do an `echo -n "" > all.txt` before the operation mentioned above. – Alexander Stohr Apr 19 '22 at 09:20
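Several of the comments above ask how to keep files from running together and how to put a separator between them. A minimal sketch (the file names and the /tmp path are invented for the demo):

```shell
# Append a newline after each file so a file that lacks a trailing
# newline can't run into the next one; the output is named all.out so
# the *.txt glob never matches the output file itself.
mkdir -p /tmp/cat_demo && cd /tmp/cat_demo
printf 'first file' > a.txt      # deliberately no trailing newline
printf 'second file\n' > b.txt
for f in *.txt; do
    cat "$f"
    printf '\n'                  # separator after every file
done > all.out
cat all.out
```

To add a visible delimiter instead, replace the `printf '\n'` with something like `printf '\n----\n'`.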
206

Just remember, for all the solutions given so far, the shell decides the order in which the files are concatenated. For Bash, IIRC, that's alphabetical order. If the order is important, you should either name the files appropriately (01file.txt, 02file.txt, etc...) or specify each file in the order you want it concatenated.

$ cat file1 file2 file3 file4 file5 file6 > out.txt
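As a concrete illustration of the ordering point (throwaway file names; LC_ALL=C is pinned so collation is the same everywhere), lexical order is not numeric order:

```shell
# Glob expansion sorts lexically, so file10 sorts before file2;
# zero-padded names keep the intended order.
export LC_ALL=C
mkdir -p /tmp/order_demo && cd /tmp/order_demo
touch file1.txt file2.txt file10.txt
echo file*.txt       # -> file1.txt file10.txt file2.txt
touch part01.txt part02.txt part10.txt
echo part*.txt       # -> part01.txt part02.txt part10.txt
```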
Chinmay Kanchi
32

The Windows shell command type can do this:

type *.txt > outputfile.txt

The type command also writes the file names to stderr, which is not captured by the > redirect operator (but will show up on the console).

David Wolf
Greg Hewgill
30

You can use Windows shell copy to concatenate files.

C:\> copy *.txt outputfile

From the help:

To append files, specify a single file for destination, but multiple files for source (using wildcards or file1+file2+file3 format).

Carl Norum
  • This is IMHO the cleanest solution, with basically no side effects that beginners could trip over, and unfortunately it does not get appreciated enough :-( – Grmpfhmbl Jun 20 '17 at 06:39
  • OP asked for Bash. – Big Rich May 03 '18 at 11:39
  • Did you read the question? "Windows shell command would be nice too..." – Carl Norum May 03 '18 at 20:40
  • Worked pretty well, except at the very end of my file I got a weird SUB special unicode character. Deleting it is pretty easy programmatically but not sure why that happened. – abelito Oct 12 '21 at 11:44
19

How about this approach?

find . -type f -name '*.txt' -exec cat {} + >> output.txt
GPrathap
  • Since OP says the files are in the same directory, you may need to add `-maxdepth 1` to the `find` command. – codeforester Jul 25 '17 at 02:52
  • Works great with a big number of files, where the accepted reply's approach fails – amine Sep 21 '17 at 12:21
  • ah i wish i knew what this plus and double redirect signify... – hello_earth Mar 27 '20 at 21:24
  • This should be the correct answer. It will work properly in a shell script. Here is a similar method if you want output sorted: `sort -u --output="$OUTPUT_FILE" --files0-from=- < <(find "$DIRECTORY_NAME" -maxdepth 1 -type f -name '*.txt' -print0)` – steveH Apr 27 '20 at 13:41
  • This is a very flexible approach relying on all the strengths of the `find`. My favourite! Surely `cat *.txt > all.txt` does the job within the same directory (as pointed out above). To me, however, becoming comfortably fluent in using `find` has been a very good habit. Today they're all in one folder, tomorrow they have multiple file-endings across nested directory hierarchies. Don't overcomplicate, but also, do make friends with `find`. :) – nJGL Nov 16 '21 at 11:32
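To hello_earth's question: `-exec cat {} +` packs as many file names as possible into each `cat` invocation (unlike `\;`, which runs cat once per file), which is why this approach survives huge file counts, and `>>` appends, so rerunning keeps growing output.txt; `>` is usually what you want. A small sketch (demo paths invented):

```shell
# -exec ... {} + passes file names to cat in large batches, so it never
# hits "Argument list too long". Using > instead of >> truncates the
# output on each run; output.txt is excluded so it can't cat itself.
mkdir -p /tmp/find_demo && cd /tmp/find_demo
printf 'one\n' > 1.txt
printf 'two\n' > 2.txt
find . -maxdepth 1 -type f -name '*.txt' ! -name 'output.txt' \
    -exec cat {} + > output.txt
sort output.txt      # sorted, since find's output order isn't guaranteed
```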
19

Be careful, because none of these methods works with a very large number of files. Personally, I used this line:

for i in $(ls | grep ".txt"); do cat $i >> output.txt; done

EDIT: As someone said in the comments, you can replace $(ls | grep ".txt") with $(ls *.txt)

EDIT: thanks to @gniourf_gniourf's expertise, a glob is the correct way to iterate over files in a directory. Consequently, blasphemous expressions like $(ls | grep ".txt") must be replaced by *.txt (see the ParsingLs article linked in the comments).

Good solution:

for i in *.txt; do cat "$i" >> output.txt; done
jackfizee
  • Why not `for i in $(ls *.txt);do cat $i >> output.txt;done`? – streamofstars Nov 12 '18 at 21:49
  • Mandatory [ParsingLs](https://mywiki.wooledge.org/ParsingLs) link, together with a downvote (and you deserve more than one downvote, because `ls | grep` is a seriously bad antipattern). – gniourf_gniourf Jan 25 '19 at 09:23
  • Got an upvote from me because it allows for arbitrary testing/operations by file name prior to output, and it's quick and easy and good for practice. (In my case I wanted: for i in *; do echo -e "\n$i:\n"; cat $i; done) – Nathan Chappell Feb 07 '19 at 22:09
  • Wouldn't the `ls *.txt` fail if there are too many files ("Argument list too long" error)? – Rafael Almeida Mar 21 '19 at 15:25
  • @Mandatory: ls *.txt | grep *.txt | awk '/*.txt/' LOL – runlevel0 Feb 08 '22 at 15:24
7

The most pragmatic way with the shell is the cat command. Other ways include:

awk '1' *.txt > all.txt
perl -ne 'print;' *.txt > all.txt
ghostdog74
  • This should be the correct answer for most circumstances. If any text file doesn't end with a newline, all of the `cat` methods above will join the last line of one file with the first line of the next. – mootmoot Oct 25 '16 at 10:56
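The missing-trailing-newline point is easy to demonstrate: awk treats an incomplete last line as a full record and re-terminates it on output (demo files invented):

```shell
# cat copies bytes as-is, so a file with no trailing newline merges
# into the next file; awk 1 prints every record followed by a newline,
# keeping the files' last and first lines separate.
mkdir -p /tmp/awk_demo && cd /tmp/awk_demo
printf 'end of a' > a.txt        # no trailing newline
printf 'start of b\n' > b.txt
cat a.txt b.txt                  # -> end of astart of b
awk 1 a.txt b.txt                # -> two separate lines
```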
3
type [source folder]\*.[file extension] > [destination folder]\[file name].[file extension]

For example:

type C:\*.txt > C:\1\all.txt

That will take all the .txt files in the C:\ folder and save their contents in the C:\1 folder under the name all.txt.

Or:

type [source folder]\* > [destination folder]\[file name].[file extension]

For example:

type C:\* > C:\1\all.txt

That will take all the files that are present in the folder and put their content in C:\1\all.txt.

Ori
1

You can do it like this: cat [directory_path]/**/*.[hm] > test.txt

If you use {} to include the extensions of the files you want to find (e.g. *.{h,m}), there is a sequencing problem.

1

The most upvoted answers will fail if the file list is too long.

A more portable solution would be using fd

fd -e txt -d 1 -X awk 1 > combined.txt

-d 1 limits the search to the current directory. If you omit this option then it will recursively find all .txt files from the current directory.
-X (otherwise known as --exec-batch) executes a command (awk 1 in this case) for all the search results at once.

Note: fd is not a "standard" Unix program, so you will likely need to install it.
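If installing fd isn't an option, roughly the same thing can be done with find and xargs (the demo paths are invented, and `-print0`/`-0` are common GNU/BSD extensions rather than strict POSIX):

```shell
# Sketch of an fd-free equivalent: NUL-delimited names survive spaces,
# awk 1 normalizes missing trailing newlines, and the output is named
# .out so the *.txt pattern can't match it.
mkdir -p /tmp/fd_demo && cd /tmp/fd_demo
printf 'alpha\n' > 'a file.txt'
printf 'beta\n'  > b.txt
find . -maxdepth 1 -name '*.txt' -print0 | xargs -0 awk 1 > combined.out
sort combined.out    # sorted, since find's output order isn't guaranteed
```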

Michael Hall
0

When you run into the problem where it cats all.txt into all.txt, you can check whether all.txt already exists and, if it does, remove it first.

Like this:

[ -e all.txt ] && rm all.txt

leo
-5

all of that is nasty....

ls | grep '\.txt$' | while read file; do cat "$file" >> ./output.txt; done;

easy stuff.

Bibhas Debnath
kSiR