I have 300 directories, each containing a single gzipped file (xxx.gz) with two columns. The first column is an identifier (ID), which is the same in all files. I want to merge the files from all directories into a single file.

How can I merge all the files into a single file?

I also want the header of each column to be the name of the file in the respective directory.

Directory names look like 68a7eb0a-123, b5694957-764, etc., and file names look like a5c403c2, 292c4a2f, etc. A directory's name and the name of the file inside it are not the same; I want the file name as the header.

All directories:
ls 
6809b1c3-75a5
68e9b641-0cc9
71ae07b8-8bde
b7815cd2-1e69
..
..

Each directory contains a single file:

cd 6809b1c3-75a5

ls bd21dc2e.txt.gz
mona
  • Please show an example directory structure and file content and the expected final file. – Krzysztof Krasoń Jul 29 '16 at 13:57
  • [Read in all your files](http://stackoverflow.com/questions/11433432/importing-multiple-csv-files-into-r) then [merge multiple data.frames in a list](http://stackoverflow.com/questions/8091303/simultaneously-merge-multiple-data-frames-in-a-list). This solution should work depending on the file sizes and memory. – zx8754 Jul 29 '16 at 14:05
  • @mona please update your post with additional info using ["edit"](http://stackoverflow.com/posts/38660539/edit). – zx8754 Jul 29 '16 at 14:06

1 Answer

Try this:

for i in * ; do for j in "$i"/*.gz ; do echo "$j" >> ../final.txt ; gunzip -c "$j" >> ../final.txt ; done ; done

Annotated version:

for i in *                           # for each directory under the current working directory
  do                                 # (the working directory should contain nothing else)
  for j in "$i"/*.gz                 # for each gzipped file in that directory
    do
    echo "$j" >> ../final.txt        # write the path/file name to the final file
    gunzip -c "$j" >> ../final.txt   # append the decompressed contents to the final file
  done
done

Result:

$ head -8 ../final.txt
6809b1c3-75a5/bd21dc2e.txt.gz
blabla
whatever
you
have
in
those
files
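
Note that the loop above stacks the files one after another with the path as a separator line; it does not put them side by side. If what you need is one column per file with the file name as header, something like the following sketch might work. It is only a sketch under assumptions the question does not confirm: two whitespace-separated columns, no header row inside the files, and the same IDs in the same order in every file. The function name merge_columns and the header label ID are made up for illustration.

```shell
# Sketch: ID column plus one value column per file, file name as header.
# Assumes: two whitespace-separated columns, no header row in the files,
# identical IDs in identical order in every file (not verified here).
merge_columns() {
  out=$1; shift
  first=1
  tmp=$(mktemp)
  for f in "$@"; do
    name=$(basename "$f" .txt.gz)   # file name without the .txt.gz suffix
    if [ "$first" -eq 1 ]; then
      # first file: keep both columns, header is "ID<TAB>name"
      { printf 'ID\t%s\n' "$name"; gunzip -c "$f" | awk '{print $1 "\t" $2}'; } > "$out"
      first=0
    else
      # later files: take only the value column and paste it to the right
      { printf '%s\n' "$name"; gunzip -c "$f" | awk '{print $2}'; } | paste "$out" - > "$tmp"
      mv "$tmp" "$out"
    fi
  done
  rm -f "$tmp"
}

# Example call, run from the directory that holds the 300 directories:
#   merge_columns ../final.tsv */*.gz
```

Because paste only lines rows up by position, this gives wrong results if the IDs are not in the same order everywhere; in that case sort each file and use join on the first column instead.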
James Brown