I have hundreds of txt files, which are all in a single directory. I would like to be able to do the following:
- Join all the files into a single txt file. This command inserts a symbol (such as §) together with the file name at the point where each file begins.
- [I then do some work on the combined file, which consists of making changes. Some of these changes involve proprietary software that works better with one big file than with lots of little files.]
- Use a second command to go through the joined file and split it back into separate files, using the file name that was next to the symbol to name each split file.
Example:
Before joining:
File 1: "Towns.txt"
Béthlem
Cabul
Corinthia
ruined lands
eshcol
Gabbatha
old town
File 2: "Fruits and Nuts.txt"
Apples
Pomegranates
Sycamore
After Joining, but before I make changes
(Single file)
§Towns.txt
Béthlem
Cabul
Corinthia
ruined lands
eshcol
Gabbatha
old town
§Fruits and Nuts.txt
Apples
Pomegranates
Sycamore
After joining, and after I make changes
(These changes are made manually in the single file)
§Towns.txt
Bethlehem
Cabul
Corinth
Ruined lands
Eshcol
Gabbatha
The Old Town
§Fruits and Nuts.txt
Apples
Pomegranates
Sycamore
After Splitting:
File 1: "Towns.txt"
Bethlehem
Cabul
Corinth
Ruined lands
Eshcol
Gabbatha
The Old Town
File 2: "Fruits and Nuts.txt"
Apples
Pomegranates
Sycamore
Steps I have tried
Combining files
I reworked the answer in this thread to make an awk command that joins the files together, with each file name prefixed by the § symbol.
awk '(FNR==1){print "§" FILENAME }1' * > ^0join.txt;
This seems to work well.
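One caveat with the command above: the `*` glob can pick up `^0join.txt` itself if it already exists from an earlier run. A minimal sketch of a more defensive variant follows. The function name `join_txt` and its two arguments are hypothetical, and it assumes the joined file is written outside the source directory. Note that awk never sees a first line in an empty file, so empty files get no marker line.

```shell
# Join every .txt file in a directory into one file, writing a "§<name>"
# marker line before the contents of each file.
join_txt() {
    # $1 = source directory, $2 = joined output file (hypothetical names).
    # $2 is assumed to live outside $1, so the glob never matches it.
    (
        cd "$1" &&
        # "./" stops awk from treating odd file names (leading "-", or
        # containing "=") as options or assignments; substr() strips the
        # "./" again so that only the bare file name follows the marker.
        awk 'FNR == 1 { print "§" substr(FILENAME, 3) } 1' ./*.txt
    ) > "$2"
}
```

For example, `join_txt ~/lists ~/joined.txt` would join the files in a (hypothetical) `~/lists` directory into `~/joined.txt`.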
Splitting files
This thread provides a solution for splitting files. I have reworked it for my needs to produce this:
awk -v RS='§' '{ outfile = "output_file_" NR; print > outfile}' ^0join.txt
The only problem is that the output files are named "output_file_1", "output_file_2", etc., rather than after the original files. They also keep the file name at the top of each file, which I do not want. Also, sometimes when I use this command, it just puts everything in a single file called "outfile" and does not split it up.
I also found this thread which had another solution, that I reworked:
awk '{print $0 "file" NR}' RS='§' ^0join.txt
However, this didn’t seem to do anything.
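For reference, here is a minimal sketch of the split behaviour I am after, written line by line instead of relying on `RS='§'`. It assumes every marker line starts with § followed directly by the file name, and that no ordinary content line starts with §. The function name `split_joined` and its arguments are hypothetical.

```shell
# Split a joined file back into the files named on the "§" marker lines.
split_joined() {
    # $1 = joined file, $2 = output directory (hypothetical names).
    mkdir -p "$2"
    awk -v dir="$2" '
        index($0, "§") == 1 {              # marker line: start a new file
            if (out != "") close(out)      # avoid running out of open files
            # length("§") is 1 or 2 depending on whether this awk counts
            # characters or bytes; substr() counts the same way, so the
            # marker is removed correctly in both cases.
            out = dir "/" substr($0, length("§") + 1)
            printf "" > out                # create or truncate the file
            next                           # do not copy the marker line
        }
        out != "" { print > out }          # copy content to the current file
    ' "$1"
}
```

For example, `split_joined ^0join.txt split_output` would write `Towns.txt`, `Fruits and Nuts.txt`, and so on into a (hypothetical) `split_output` directory. Any lines before the first marker are ignored.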
Notes
The § can be any other symbol.
I am using Mac OS 10.14.6, so I would like something that would work in the terminal of Mac OS.