24

I need to concatenate 3 files using C#: a header file, a content file, and a footer file. I want to do this in the coolest way possible.

Cool = really small code or really fast (non-assembly code).

Peter Mortensen
Bobby Bruckovnic

6 Answers

42

I agree with Mehrdad Afshari's approach: his code is essentially the same as what System.IO.Stream.CopyTo does. I still wonder why he did not use that function instead of rewriting its implementation.

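// Requires: using System.IO;
string[] srcFileNames = { "file1.txt", "file2.txt", "file3.txt" };
string destFileName = "destFile.txt";

// File.Create truncates an existing destination file. File.OpenWrite would
// leave stale trailing bytes if the old file was longer than the new content.
using (Stream destStream = File.Create(destFileName))
{
    foreach (string srcFileName in srcFileNames)
    {
        using (Stream srcStream = File.OpenRead(srcFileName))
        {
            // Appends the whole source file at the current destination position.
            srcStream.CopyTo(destStream);
        }
    }
}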

According to the disassembler (ILSpy), the default buffer size is 4096 bytes (newer .NET versions use a larger default of 81920 bytes). The CopyTo function has an overload that lets you specify the buffer size if you are not happy with the default.
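As a rough sketch of the buffer-size overload mentioned above: the class name `FileConcat`, the method name `Concatenate`, and the buffer value used in the usage line are made up for illustration; only `Stream.CopyTo(Stream, int)`, `File.Create`, and `File.OpenRead` are framework calls.

```csharp
using System.IO;

// Hypothetical helper, not part of the framework.
public static class FileConcat
{
    // Concatenates the source files, byte for byte, into destFileName.
    // bufferSize is handed to Stream.CopyTo(Stream, int).
    public static void Concatenate(string destFileName, int bufferSize, params string[] srcFileNames)
    {
        // File.Create truncates the destination if it already exists.
        using (Stream destStream = File.Create(destFileName))
        {
            foreach (string srcFileName in srcFileNames)
            {
                using (Stream srcStream = File.OpenRead(srcFileName))
                {
                    srcStream.CopyTo(destStream, bufferSize);
                }
            }
        }
    }
}
```

It could be called as `FileConcat.Concatenate("destFile.txt", 65536, "file1.txt", "file2.txt", "file3.txt");`. Whether a larger buffer actually helps depends on the files and the disk, so it is worth measuring.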

Uwe Keim
user664769
27
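// Requires: using System.IO;
const int BUFFER_SIZE = 81920; // example value only; tune it for your situation

// Copies everything remaining in source to destination, one buffer at a time.
void CopyStream(Stream destination, Stream source) {
   int count;
   byte[] buffer = new byte[BUFFER_SIZE];
   // Read returns 0 once the end of the stream is reached.
   while( (count = source.Read(buffer, 0, buffer.Length)) > 0)
       destination.Write(buffer, 0, count);
}


// Append each input, in order, to the already-open output stream.
CopyStream(outputFileStream, fileStream1);
CopyStream(outputFileStream, fileStream2);
CopyStream(outputFileStream, fileStream3);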
Mehrdad Afshari
  • 2
    i don't find this so smart. What is a good BUFFER_SIZE? nobody knows. compared to File.ReadAllText(a) + File.ReadAllText(b) + File.ReadAllText(c), this looks like premature optimization to me. – nes1983 Jan 14 '09 at 19:50
  • 2
    Depends on the size of your files. You would not like to concatenate a few hundred megabytes using the ReadAllText method. – gimpf Jan 14 '09 at 19:54
  • I agree. But in such cases, I would really leave such a task to the experts that wrote a library or, as would cross my thoughts, the cat program. I would certainly not read it chunkwise with some chunk-size a fair dice-roll gave me. – nes1983 Jan 14 '09 at 20:14
  • In fact, this is the traditional method to copy a stream. A good buffer size depends on the specific situation, but probably it has to be a multiple of block size at very least. Remember, file sizes are among the things that easily go beyond 32 bit integer max value ;) – Mehrdad Afshari Jan 14 '09 at 22:56
  • Yea, I've seen this method. but is it smart to pretend low-level in a language like C#? to get this right, you have to consider context switches, library io buffers, etc. Buffer-copying is low-level stuff and should be done on the low level, if you REALLY need it, which isn't the usual case. – nes1983 Jan 15 '09 at 05:51
  • @Nico, Smartness and simplicity are relative. – Mehrdad Afshari Jan 15 '09 at 10:26
  • i got too much into this, hm? i'm sorry, i didn't mean to offend you. – nes1983 Jan 15 '09 at 13:29
  • @Nico, I don't know how you interpreted my statement. I meant, the amount of simplicity and smartness depends on the situation. In this special case (using files), the performance of loading the whole files can be extremely critical which cannot always be ignored just because you're coding in C#. – Mehrdad Afshari Jan 15 '09 at 13:46
  • Most cases, Jimmy's way of doing it is fine. If you have large files and need speed, reading from the output of "cat file1 file2 file3" is probably going to be faster than having the buffer on the C# side. – nes1983 Jan 15 '09 at 14:33
  • It might be but it has some other problems. Creating the process, introducing platform dependence, reliance on a third party program and making stuff more complex. You have to make these trade-offs and the "right" solution depends on the specific scenario. – Mehrdad Afshari Jan 15 '09 at 14:41
  • I was thinking all the time why you think it's that much "low-level". I thought it might be worth mentioning that FileStreams are internally buffered, so the array here is not doing any actual file buffering to reduce low level syscalls. It's just reducing method calls and it's not that low level. – Mehrdad Afshari Jan 15 '09 at 17:07
  • Yea, but this is kind of my point: using these buffers looks low-level and complicated, but actually it's above the library-buffer, which reads from the OS buffer, which reads from the disc buffer. if i want to use buffers, i do it low-level. if i want to attach files high-level, i do f1+f2+f3. – nes1983 Jan 15 '09 at 20:06
  • 2
    By the way, performance aside, unless you are dealing with text files, `ReadAllText` is definitely not the way to go. You can corrupt the contents of the files while reading them into strings. – Mehrdad Afshari Nov 15 '10 at 00:21
  • A situation where you might want this kind of lower level approach is where you want to feedback progress to the user. – Giles Jul 07 '17 at 14:10
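One possible sketch of the progress-feedback case raised in the last comment, built on the same read/write loop as the answer. The names `ProgressCopy` and `onProgress`, and the default buffer size, are assumptions made for this example and do not come from the thread.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the framework.
public static class ProgressCopy
{
    // Copies source to destination and reports the running byte total
    // after every chunk. Returns the total number of bytes copied.
    public static long CopyStream(Stream destination, Stream source,
                                  Action<long> onProgress, int bufferSize = 81920)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int count;
        // Read returns 0 once the end of the stream is reached.
        while ((count = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, count);
            total += count;
            if (onProgress != null)
            {
                onProgress(total);
            }
        }
        return total;
    }
}
```

A caller could pass something like `bytes => Console.WriteLine(bytes)` as the callback. A smaller buffer gives more frequent updates at the cost of more method calls.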
7

If your files are text and not large, there's something to be said for dead-simple, obvious code. I'd use the following.

string combined = File.ReadAllText("file1") + File.ReadAllText("file2") + File.ReadAllText("file3");

If your files are large text files and you're on Framework 4.0, you can use File.ReadLines to avoid buffering the entire file.

File.WriteAllLines("out", new[] { "file1", "file2", "file3" }.SelectMany(File.ReadLines));

If your files are binary, see Mehrdad's answer.

Community
Jimmy
  • 1
    disclaimer: will not work for large files. See Mehrdad Afshari's code for what to do then. – Jimmy Jan 14 '09 at 19:23
  • 2
    This looks like it will need a lot of memory to hold all three files in memory as strings, let alone the intermediate string object resulting from adding the first two and the final string created by adding on the last. – Brian Ensink Jan 14 '09 at 19:26
  • Jimmy you beat me to it with your own disclaimer! :) – Brian Ensink Jan 14 '09 at 19:27
  • 1) Concatenating strings is not good practice. You should use StringBuilder. 2) This is not a good solution for binary files. – TcKs Jan 14 '09 at 19:40
  • 2
    @TcKs: 1) inline concatenation here is done by String.Concat(a, b, c) in one operation, with lower overhead than StringBuilder 2) I assumed "header/content/footer" files were text. – Jimmy Jan 14 '09 at 19:57
  • Note this will also rewrite your file as UTF-8 which may not be what is wanted, the above solution from user664769 that uses File.OpenRead operates on the bytes and therefore is a true concatenation of the files without worrying about encoding getting hammered. – aolszowka Oct 08 '18 at 20:46
6

Another way: how about letting the OS do it for you?

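// Requires: using System; using System.Diagnostics;
// /b copies in binary mode (the default when combining files is ASCII mode),
// /y suppresses the overwrite prompt, and the quotes protect paths with spaces.
ProcessStartInfo psi = new ProcessStartInfo("cmd.exe",
        String.Format(@"/c copy /b /y ""{0}"" + ""{1}"" + ""{2}"" ""{3}""",
            file1, file2, file3, dest));
psi.UseShellExecute = false;
Process process = Process.Start(psi);
process.WaitForExit();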
Kev
  • 4
    You let the OS do it no matter what, unless you're running on bare metal. But shelling out is something that shouldn't be done unless desperately needed. – Joey Jan 01 '13 at 20:16
3

You mean 3 text files?? Does the result need to be a file again?

How about something like:

string contents1 = File.ReadAllText(filename1);
string contents2 = File.ReadAllText(filename2);
string contents3 = File.ReadAllText(filename3);

File.WriteAllText(outputFileName, contents1 + contents2 + contents3);

Of course, with a StringBuilder and a bit of extra smarts, you could easily extend that to handle any number of input files :-)
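A minimal sketch of that StringBuilder extension, assuming the inputs are text files small enough to hold in memory. The class name `TextConcat` and the method name `ConcatenateText` are invented for this example.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the framework.
public static class TextConcat
{
    // Reads every input file as text, appends the contents in order,
    // and writes the combined text to outputFileName.
    public static void ConcatenateText(string outputFileName, params string[] inputFileNames)
    {
        var builder = new StringBuilder();
        foreach (string inputFileName in inputFileNames)
        {
            builder.Append(File.ReadAllText(inputFileName));
        }
        // File.WriteAllText without an encoding argument writes UTF-8.
        File.WriteAllText(outputFileName, builder.ToString());
    }
}
```

Usage would look like `TextConcat.ConcatenateText("out.txt", "header.txt", "content.txt", "footer.txt");`. Everything is still held in memory at once, so this only suits small text files.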

Cheers

Brian Webster
marc_s
  • 2
    This will force the contents of all the files to be loaded into memory. If you have large files this is extremely inefficient as you will likely force objects into the large object heap. Using a buffer will prevent the need for garbage collection and will almost certainly be more performant. – brianfeucht Mar 30 '16 at 15:23
1

If you are in a Win32 environment, the most efficient solution could be to use the Win32 API function "WriteFile". There is an example in VB 6, but rewriting it in C# is not difficult.

Peter Mortensen
TcKs