0

I'm trying to find a way to concatenate several files, typically all files within a directory (including subdirectories), into a single stream for further processing. TAR looks like an obvious candidate, except that it is not standard on Windows, and unfortunately all the versions I could find (mostly variations of GNU tar) are much too big (several hundred KB once DLL dependencies are included). I need something much smaller.

Apparently, the standard COPY command could do the trick. For example, the following command works: COPY /B sourcefile1+sourcefile2 destinationfile

However, there are still two problems: I don't know how to write the result to stdout (for piping) and, even more importantly, I don't know how to perform the reverse operation.

I need a small utility to do this concatenation job, either as C source code, a standard Windows command, or a distributable binary. It doesn't need to follow the TAR format (although it would not be a bad thing if it did). And obviously the concatenation must be reversible.

Cyan

3 Answers

2

I suggest using 7-Zip. It has a portable version, compresses very well (or can just store files without compression), recurses into subdirectories, and can write its output to a single stream (stdout).

It has the "-so" switch (write data to stdout). For example,

7z x archive.gz -so > Doc.txt

decompresses the archive.gz archive to the output stream and then redirects that stream to the Doc.txt file.

7z a -tzip -so -r src\*.cpp src\*.h > archive.zip 

compresses all *.cpp and *.h files in the src directory and all its subdirectories to 7-Zip's standard output stream and writes that stream to the archive.zip file (remove "> archive.zip" and capture the output in your own program).
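As a rough illustration of capturing that output in your own program, here is a minimal C sketch. The function name capture_stdout and the two macros are hypothetical, not part of 7-Zip; the sketch only relies on popen (POSIX) or _popen (Windows C runtime) and buffers everything in memory, so it suits modest outputs only. It also ignores the child's exit status.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes popen/pclose on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef _WIN32
#define OPEN_PIPE(cmd) _popen((cmd), "rb")  /* binary mode matters for archives */
#define CLOSE_PIPE(p)  _pclose(p)
#else
#define OPEN_PIPE(cmd) popen((cmd), "r")
#define CLOSE_PIPE(p)  pclose(p)
#endif

/* Run cmd and return everything it wrote to stdout in a malloc'd buffer.
 * Returns NULL on failure; *out_len receives the number of bytes read.
 * The caller frees the returned buffer. */
char *capture_stdout(const char *cmd, size_t *out_len) {
    assert(cmd && out_len);
    FILE *p = OPEN_PIPE(cmd);
    if (!p) return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    while (buf) {
        size_t n = fread(buf + len, 1, cap - len, p);
        len += n;
        if (n == 0) break;                 /* end of stream or read error */
        if (len == cap) {                  /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (!tmp) { free(buf); buf = NULL; }
            else { buf = tmp; cap *= 2; }
        }
    }
    CLOSE_PIPE(p);
    *out_len = buf ? len : 0;
    return buf;
}
```

A call such as capture_stdout("7z a -tzip -so -r src\\*.cpp src\\*.h", &n) would pass the command from this answer unchanged. Whether "-so" is accepted for a given archive type depends on the 7-Zip version, so check the result before relying on it.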

izogfif
  • Interesting. The v4.65 of 7z is 150KB, while the v9.20 is 585KB. Quite a difference, so I'll stick to v4.65. Although 150KB alongside a full-fledged compressor looks like a bit of an overkill for a simple concatenation function, this solution seems the best so far. – Cyan Oct 27 '11 at 09:39
  • Hmm, for some reason it doesn't work. Using -so to write to standard output always produces the message "not implemented". – Cyan Oct 27 '11 at 13:11
  • @Cyan: There's a developer API and library designed for use in your own code, I imagine that'd be smaller yet, and much easier to use than building command lines and capturing standard output from a child process. – Ben Voigt Oct 27 '11 at 15:12
  • @Cyan: A little bit of it, yes. I was building a .NET assembly based on that... got it to build but never had enough time to write proper C#-friendly wrappers. – Ben Voigt Oct 27 '11 at 18:10
  • Unfortunately, although I can download and compile the source code "as is", producing a binary equivalent to the one offered, there is no way for me to dig through the myriad of source files to find the one function I need (which is just about concatenation). – Cyan Oct 28 '11 at 14:32
1

Why don't you use ZIP (disable compression if you want)? It's very standard, and support comes built into Windows. See Creating a ZIP file on Windows (XP/2003) in C/C++

Pure concatenation isn't reversible, because you can't know where to split the stream again. So you need to store each chunk's size (and name) alongside the data, as the ZIP and TAR formats do.
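To make the point concrete, here is a minimal C sketch of a reversible concatenation. The record layout and the names pack_entry and unpack_entry are hypothetical, invented for this illustration; this is neither TAR nor ZIP. Each entry is written as a name length, a data length, the name, then the data, so the reader always knows where one file ends and the next begins. It keeps each entry fully in memory, which a real tool would avoid for large files.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout (not TAR, not ZIP):
 *   u32 name_len | u64 data_len | name bytes | data bytes
 * Integers are stored little-endian so the stream is portable. */

static int put_le(FILE *out, uint64_t v, int nbytes) {
    for (int i = 0; i < nbytes; i++)
        if (fputc((int)((v >> (8 * i)) & 0xFF), out) == EOF) return -1;
    return 0;
}

static int get_le(FILE *in, uint64_t *v, int nbytes) {
    *v = 0;
    for (int i = 0; i < nbytes; i++) {
        int c = fgetc(in);
        if (c == EOF) return -1;
        *v |= (uint64_t)c << (8 * i);
    }
    return 0;
}

/* Append one named blob to the stream. Returns 0 on success, -1 on error. */
int pack_entry(FILE *out, const char *name, const void *data, uint64_t len) {
    assert(out && name && (data || len == 0));
    size_t nlen = strlen(name);
    if (put_le(out, nlen, 4) || put_le(out, len, 8)) return -1;
    if (fwrite(name, 1, nlen, out) != nlen) return -1;
    if (len && fwrite(data, 1, (size_t)len, out) != len) return -1;
    return 0;
}

/* Read the next entry. Returns 1 if an entry was read, 0 at a clean end of
 * stream, -1 if the stream is truncated or memory runs out.
 * On success the caller frees *name and *data. */
int unpack_entry(FILE *in, char **name, void **data, uint64_t *len) {
    assert(in && name && data && len);
    uint64_t nlen;
    int c = fgetc(in);
    if (c == EOF) return 0;              /* no more entries */
    ungetc(c, in);
    if (get_le(in, &nlen, 4) || get_le(in, len, 8)) return -1;
    *name = malloc((size_t)nlen + 1);
    *data = malloc(*len ? (size_t)*len : 1);
    if (!*name || !*data ||
        fread(*name, 1, (size_t)nlen, in) != nlen ||
        fread(*data, 1, (size_t)*len, in) != *len) {
        free(*name);
        free(*data);
        return -1;
    }
    (*name)[nlen] = '\0';
    return 1;
}
```

To use this over a pipe on Windows, stdin and stdout would also need to be switched to binary mode first, for example with _setmode(_fileno(stdout), _O_BINARY) from <io.h> and <fcntl.h>; otherwise newline translation corrupts the stream.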

Community
Ben Voigt
  • Yes, indeed. COPY concatenates file contents directly, with no other information, so retrieving the original files (names, sizes, etc.) is impossible. – Cyan Oct 27 '11 at 09:40
0

Well, Shelwien has almost solved the issue. The tar version he proposes is lean enough (~120KB) and does not require external DLL dependencies. http://downloads.sourceforge.net/project/unxutils/unxutils/current/UnxUtils.zip

Unfortunately, it has some problems of its own: no support for Unicode characters, escape sequences being interpreted in paths (so a directory name starting with t produces a \t, which is treated as a tab), and a potential problem with the pipe implementation under Windows XP (although that last one could come from the other program). So that's a dead end.

A solution is still to be found...

[Edit] Shelwien just provided a solution by creating "shar", a tar replacement that is much smaller and much more efficient, without the limitations described above. This solves the issue.

Cyan
  • Here is the [link to shar](http://encode.ru/threads/1397-tar-replacement-for-Cyan?p=27066&viewfull=1#post27066) for those who are interested. – izogfif Oct 29 '12 at 13:58