2

I have a Windows service which must write about 20 kB of data to about 20 files over 2 network shares.

Time to write the files with Total Commander: less than 0.1s.

Time to write the files with my application: about 10s.

What is wrong? Yes, the files are constantly being read from both shares, but that should not be an issue, since:

public void WriteData(string text, string fileName, bool forceBackup = false) {
    foreach (var dir in Locations) {
        var path = string.Format(@"{0}\{1}", dir, fileName);
        FileStream stream = null;
        try {
            stream = new FileStream(
                path,
                FileMode.Create,    // overwrite the file on every update
                FileAccess.Write,
                FileShare.Read      // readers (the web server) are still allowed in
            );
            using (StreamWriter writer = new StreamWriter(stream)) {
                stream = null;      // the writer now owns the stream; avoid double dispose in finally
                writer.Write(text);
            }
        }
        catch (Exception) { } // irrelevant now, tested it doesn't throw exceptions anyway
        finally {
            if (stream != null) stream.Dispose();
        }
        File.SetLastWriteTime(path, DateTime.Now); // workaround, see the note below
    }
}

The code above works fine with local files. It takes only a few milliseconds to write all the data to a RAM drive. My network shares are located on RAM drives too.

What's important: copying those files to exactly the same locations with Total Commander also takes only a few milliseconds. I mean copying to the network shares under full load.

There is no sharing violation, and the application writes the files in a single thread. There is no problem writing to those files from Total Commander, and there is no problem writing to those files with my application without using the network share.

There is no sharing violation, because while being written those files are only READ by the web server, with FileAccess.Read and FileShare.ReadWrite explicitly set.
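
For reference, the reader side presumably opens the files roughly like this (an illustrative sketch, not the actual web server code; "path" is a placeholder for one of the files written by WriteData):

// Illustrative only: a reader that does not block the writer.
using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var reader = new StreamReader(stream)) {
    string content = reader.ReadToEnd();
}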

No, I cannot write those files without using a network share, since what my service does is synchronization between two servers. No, I can't use DFSR, because the files are updated far too often (2 times per second). No, I can't use 2 separate services to update the files on both machines, because that would cancel the failsafe feature, where each instance of my service can be stopped without the data updates on both servers stopping.

In detail: in my production environment there are 2 instances of this service, where one is updating the files and the other constantly monitors whether the active one is doing its job. When a failure is detected, they switch roles. It all happens in real time and works like a charm. With one huge glitch: the ultra-long delay when writing the files.

If you wonder what File.SetLastWriteTime() is for - it's a workaround for a Windows (.NET) bug where the file's last write time is not correctly updated by the create/write alone. And of course the correct modification time is crucial for the other instance to detect whether the first one is updating the files on time.
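
To illustrate the idea (this is not the actual monitoring code; the method name and threshold are made up):

// Illustrative freshness check on the standby instance.
static bool UpdaterSeemsAlive(string path, TimeSpan maxAge) {
    if (!File.Exists(path)) return false;
    // With updates twice per second, anything older than a couple of seconds means trouble.
    return DateTime.Now - File.GetLastWriteTime(path) <= maxAge;
}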

Also: I have received reports that sometimes garbage is read from those files. It happens very rarely though, and I haven't confirmed this bug myself.

But the main question is - what is taking so long? There is a fast 1 GBit link between my test server and the target servers, and a 10 GBit link between the production servers. Ping is below 1 ms. It is NOT A NETWORK ISSUE.


After some more testing I've found that buffer size and file options have no effect on the write time. I've found that network ping is very important. I cannot test the code on my development machine, because its ping is too high.

Anyway - the code could be optimized to run about 80% faster if all the files were created once and then updated without recreating the streams. It's also very fast when used against a local share. Still, the test code is fast while the production code on the very same server is 50x slower. There is, however, a slight difference - the production files are constantly read by a web server, while the test files are not.
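
For completeness, the "create once, then update in place" variant I benchmarked looks roughly like this (a sketch only; as explained below, keeping the files open is not acceptable in production):

// Sketch of the keep-streams-open variant (requires System.IO, System.Text, System.Collections.Generic).
private readonly Dictionary<string, FileStream> _streams = new Dictionary<string, FileStream>();

public void WriteDataKeepOpen(string text, string fileName) {
    foreach (var dir in Locations) {
        var path = string.Format(@"{0}\{1}", dir, fileName);
        FileStream stream;
        if (!_streams.TryGetValue(path, out stream)) {
            stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.Read);
            _streams[path] = stream;
        }
        var bytes = Encoding.UTF8.GetBytes(text);
        stream.Position = 0;
        stream.Write(bytes, 0, bytes.Length);
        stream.SetLength(bytes.Length);  // truncate any leftover from a longer previous write
        stream.Flush();
    }
}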

Still - goal not reached. I need 20 x 1 kB files to be updated twice every second on 2 servers linked with 10 GBit/s Ethernet. The 200 ms delay achieved is acceptable, but it works only with test files; with the shared real files I still get over 6000 ms per update.

BTW, leaving the files open is not an option here, since reliability is critical. The service should be able to switch all updating to another instance seamlessly in case any of the files is deleted, or any network, DB or disk error occurs. Leaving files open could lead to sharing violations, memory leaks and other disasters. Correctly handling constantly open files would also be very complicated, which would make the code even harder to debug.

Maybe there's another way to share data between servers? Maybe a zip file could be used to upload the files, and then another service would unzip them? I'm positive that zipping and unzipping 1 kB of data would never take as long as 1 second!
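
If anyone wants to try that route, a minimal sketch with System.IO.Compression (.NET 4.5+) could look like this; the paths and localStagingDir are illustrative:

// Sketch: bundle the ~20 small files into one archive and push it in a single transfer.
var tmp = @"\\server\share\update.zip.tmp";   // illustrative paths
var target = @"\\server\share\update.zip";
using (var archiveStream = new FileStream(tmp, FileMode.Create, FileAccess.Write, FileShare.None))
using (var archive = new ZipArchive(archiveStream, ZipArchiveMode.Create)) {
    foreach (var file in Directory.GetFiles(localStagingDir))
        archive.CreateEntryFromFile(file, Path.GetFileName(file));
}
File.Delete(target);      // swap in the fresh archive
File.Move(tmp, target);
// The receiving service would then call ZipFile.ExtractToDirectory(target, destinationDir);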

Harry
    While I don't think it would solve the issue, you could make your code *much* shorter by just using `File.WriteAllText(path, text)`. (It's worth *trying* to see if it makes it faster as well, but I wouldn't expect it to.) – Jon Skeet Mar 07 '14 at 10:28
  • Please, do not include information about a language used in a question title unless it wouldn't make sense without it. Tags serve this purpose. – Ondrej Janacek Mar 07 '14 at 10:42
  • @Jon Skeet: I can't, I need the files to be shared for reading. I can't specify file sharing mode with File.WriteAllText, can I? – Harry Mar 07 '14 at 10:54
  • Ah, no. I missed that. Even the `StreamWriter` constructor doesn't allow that option. Using a `using` statement instead of the `try/catch/finally` would still be simpler, mind :) – Jon Skeet Mar 07 '14 at 11:55
  • Create a tiny repro code. That narrows it down. Something like 1) open stream, 2) write a single byte. That's it. – usr Mar 07 '14 at 12:03
  • @usr: I've just done that. The test code runs about 50x faster despite transferring almost twice as much data as the production code, but the files are not constantly read by the web server. Creating a test emulating heavy traffic from web clients would take too much time, which I don't have now. – Harry Mar 07 '14 at 12:25
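
For reference, a tiny repro along the lines usr suggests above can be as small as this (the share path is a placeholder):

// Minimal repro: time a single open + 1-byte write + close against the share.
var sw = System.Diagnostics.Stopwatch.StartNew();
using (var stream = new FileStream(@"\\server\share\probe.dat",
                                   FileMode.Create, FileAccess.Write, FileShare.Read)) {
    stream.WriteByte(0x42);
}
sw.Stop();
Console.WriteLine("Open + write + close took {0} ms", sw.ElapsedMilliseconds);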

2 Answers

3

I think the answer to your question lies in this post:

Writing to file using StreamWriter much slower than file copy over slow network

To directly answer the question: the stream writes the file in 4 KB chunks rather than 64 KB chunks, which causes more round-trips.

You should be able to change this; see this answer:

https://stackoverflow.com/a/14588922/3323733
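
If you want to try it, the buffer size can be raised on both the stream and the writer, roughly like this (a sketch based on the linked answer; 64 KB is the value it suggests, with path and text as in your WriteData method):

// Sketch: larger buffers so writes go out in bigger chunks (per the linked answer).
const int BufferSize = 64 * 1024;
using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                   FileShare.Read, BufferSize))
using (var writer = new StreamWriter(stream, Encoding.UTF8, BufferSize)) {
    writer.Write(text);
}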

Dave3of5
  • It doesn't work, unfortunately. The files are less than 1 KB. I use 20 x 1 KB files in my test and the write time does not differ with various buffer sizes specified. BTW, the network is not a slow one. It's a very fast network; I have 10 MBit upload on my test machine and ping less than 5 ms. The issue exists even when writing to localhost! The same happens when testing on production servers with a 10 GBit link between them. Times are almost identical. – Harry Mar 07 '14 at 11:00
  • Ok, this answer was specifically about large files. You'd be better off writing them to a temp folder and then copying them across to the network share. That way, if anything goes wrong you can roll back. – Dave3of5 Mar 07 '14 at 13:34
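
A minimal sketch of that temp-folder-then-copy variant (with text, fileName and Locations as in WriteData) might be:

// Sketch: write locally first, then push the finished file to each share in one copy.
var localTmp = Path.Combine(Path.GetTempPath(), fileName);
File.WriteAllText(localTmp, text);
foreach (var dir in Locations)
    File.Copy(localTmp, Path.Combine(dir, fileName), true);  // true = overwrite
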
0

I was wrong about it being slow. It is not. The problem was in the test procedure. I've added benchmarking code to my production service and it revealed that it works exactly as expected. There was no bug in my code, nor lags in my system.

A chain of events led to this situation:

  1. A guy called me saying there is a huge (over 5 seconds) lag in updating our data.
  2. I ran a debug version of the updating service on my development machine and saw that the lag is almost exactly 5 to 6 seconds.
  3. I created this post and a test project which writes test files and measures the time.
  4. I tested the code on a production server and saw that it works fine without big lags.
  5. I added benchmarking code to the production service to see how it performs in real life (and it performed perfectly).
  6. I called the guy who had called me and asked if he still sees the lag - he said there's no lag now.

What?! Well, it was probably HIS network lag! My fault! I should have tested the code on the production server from the start. When network connection speed and latency are crucial to the process, the program cannot be tested in any other environment - or it can be tested, but not for speed or lag. Probably if I had one bigger file the differences would be less significant, but with 20 small files the difference between my machine and the production server is huge.

BTW, the WriteData method is optimal; there is nothing to change there - I tested it with all possible file modes and buffer sizes. The only way to speed it up is to keep all the files open, but it's not worth the effort. The production server is also about 10 times faster at uploading files to a remote FTP. Well, my WAN is a turtle compared to the company LAN, and even to the company WAN. It should have been obvious from the start.

Harry