1

We have a heavily used .NET 3.5 application that reads "expensive to create" data and caches it. However, we are getting a lot of errors both when reading from the cache file and when writing to it. Some of the advice I have received on Stack Overflow is to:

  • a. Read the file in "FileShare.Read" mode and write to the file in "FileShare.ReadWrite" mode. (Which "FileAccess" mode should be used when the system is doing the read/write operation?)
  • b. Use "GC.Collect" after each read and write operation. (What are the performance implications of doing this after each read/write operation?)

Is this a correct way of reading and writing files? Please advise.

private XmlDocument ReadFromFile(string siteID, Type StuffType)
{
   XmlDocument result = null;
   var fsPath = FileSystemPath(siteID, StuffType.Name);
   result = new XmlDocument();
   using (var streamReader = new StreamReader(fsPath))
   //using (var fileStream = new FileStream(fsPath, FileMode.Open, FileAccess.Read, FileShare.Read))
   {
      result.Load(streamReader);
   }
   //GC.Collect();                
   return result;
}

private readonly object thisObject = new object();
private void WriteToFile(string siteID, XmlDocument stuff, string fileName)
{
   var fsPath = FileSystemPath(siteID, fileName);
   lock (thisObject)
   {
      //using (var fileStream = new FileStream(fsPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
      using (var streamWriter = new StreamWriter(fsPath))
      {
         stuff.Save(streamWriter);
      }
      //GC.Collect();
   }
}
Brian
Ajit Goel

3 Answers

7

If you want to synchronize access to a resource, there are several options, depending on the context. There are several (generic) situations:

  1. Single process, single thread

    No synchronization required.

  2. Single process, multiple threads

    Use a simple locking mechanism like lock or ReaderWriterLockSlim.

  3. Multiple processes, single machine

    Use a (named) Mutex. A Mutex is not very fast. More about performance at the bottom.

  4. Multiple processes, multiple machines

    Now it starts getting interesting. .NET does not have an out-of-the-box solution for this. I can think of two solutions:

    • The try-again method: Make a while loop with a try-catch inside. Perform the resource operation in the try block. If it succeeds, return a successful result. If it fails, wait a few milliseconds and try again... and again... and again.
    • The synchronization master: Run a web service at a central location in the network. Every process that wants access to the resource first has to ask the service for permission. If the resource is "locked", the service waits, which makes the requesting process wait as well. As soon as the resource is released, the service is informed and allows the next process in line to access the resource.
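
The try-again method could be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: the class name `RetryingFile`, the helper `WithRetry`, and the attempt count and delay are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Threading;

public static class RetryingFile
{
    // Runs 'operation' and retries when it throws an IOException
    // (for example a sharing violation). maxAttempts and delayMs are
    // illustrative values chosen by the caller, not recommendations.
    public static T WithRetry<T>(Func<T> operation, int maxAttempts, int delayMs)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (IOException)
            {
                if (attempt >= maxAttempts) throw; // give up and surface the last error
                Thread.Sleep(delayMs);             // wait a little, then try again
            }
        }
    }

    // Hypothetical helper: reads a whole file, retrying while it is busy.
    public static string ReadAllText(string path)
    {
        return WithRetry(() =>
        {
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            using (var reader = new StreamReader(stream))
            {
                return reader.ReadToEnd();
            }
        }, 5, 50);
    }
}
```

Capping the number of attempts keeps a permanently locked file from stalling the caller forever; whether 5 attempts and 50 ms are sensible depends on how long writers usually hold the file.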

In this case
Of course, this last solution is a generic one. In Ajit Goel's case it could be as simple as creating a centralized service to read/write the files. Then you have one file master that is in control of its files.

Another solution could be to store all your files inside a central database and let it do all the synchronization.

Performance
If performance starts to be an issue, you could try to create a cache.

  1. You could create a cache in memory (but with a lot of files or large files, that could become a memory issue).
  2. You could create a cache in a local folder (one per process). As soon as the file in the centralized location is modified (just compare the dates), you can copy that file (with a Mutex lock) to your own local folder. From there you can read the files over and over without a lock, using read access and read sharing.
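
The second caching option might look something like the sketch below. The class name `LocalCacheCopy` and the mutex name are hypothetical, and note that a named Mutex only coordinates processes on the same machine; it does not protect against a writer on another server touching the central file during the copy.

```csharp
using System;
using System.IO;
using System.Threading;

public static class LocalCacheCopy
{
    // Hypothetical mutex name; every process on this machine must use the same one.
    private const string MutexName = "SiteCacheCopy";

    // Returns the path of the local copy, refreshing it first when the
    // central file has a newer last-write time than the local one.
    public static string Refresh(string centralPath, string localPath)
    {
        using (var mutex = new Mutex(false, MutexName))
        {
            bool owned = false;
            try
            {
                try { owned = mutex.WaitOne(); }
                catch (AbandonedMutexException) { owned = true; } // previous owner died; we hold it now

                DateTime centralTime = File.GetLastWriteTimeUtc(centralPath);
                if (!File.Exists(localPath) || File.GetLastWriteTimeUtc(localPath) < centralTime)
                {
                    File.Copy(centralPath, localPath, true);          // overwrite the stale copy
                    File.SetLastWriteTimeUtc(localPath, centralTime); // keep timestamps comparable
                }
            }
            finally
            {
                if (owned) mutex.ReleaseMutex();
            }
        }
        return localPath;
    }
}
```

Readers would call `Refresh` before opening the local file with `FileAccess.Read` and `FileShare.Read`.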
Martin Mulder
  • Unfortunately I cannot do this. We have over 5000 sites and each site has cache data of 100 kb. I will get a lot of resistance from the upper management if I propose that we store so much data in memory. – Ajit Goel Apr 24 '13 at 08:00
  • In that case I would go with the suggestion of Scott Mermelstein to use the ReaderWriterLock(Slim) (which I upvoted). I still think that combining read/write operations on the same file will fail, so try to solve this problem on a coding-level. – Martin Mulder Apr 24 '13 at 21:25
  • How will this solution work if 2 processes are trying to write to the same file at the same time? It seems the recommended solution is to use Mutexes. Please see here: http://stackoverflow.com/questions/1160233/concurrent-file-write. "Don't use neither ;). ReadWriter locks, Monitors (locks) etc. are not capable of inter process locks, which is what author meant I think (Like, if two different processes are writing in the same moment to the file)?" – Ajit Goel May 01 '13 at 07:03
  • It all depends on the context. If multiple threads in one process write to the same file, a lock will work. If multiple instances of the same process on the same machine write to the same file, then a Mutex would do the trick. If multiple instances of the same process on different machines write to the same file, then you have quite some work to do. So... to come back to your situation: tell me more about your context. Multiple threads? Multiple instances of the same process? Multiple machines? – Martin Mulder May 01 '13 at 08:39
  • We have 2 applications. Application One uses Mutex to lock the file when a process is writing to the file. This application is hosted on 8 web servers and the cache files are being written into a single folder location. We are getting the above error when writing to the file. Application two uses a "lock" to lock the file when a process is writing to the file. This application is hosted on 8 web servers and the cache files are being written in a web server folder. We are getting the above error when both reading and writing to the file. The question that I had asked here was about application two – Ajit Goel May 01 '13 at 10:15
  • Now it starts making sense. Just to be clear, you say "cache files are being written into a single folder location". Do you mean: "one location in the whole network" or "one location per each web server"? – Martin Mulder May 01 '13 at 10:32
  • For Application one, it is one location in the whole network. For Application two, it is one location each web server. – Ajit Goel May 01 '13 at 10:36
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/29286/discussion-between-ajit-goel-and-martin-mulder) – Ajit Goel May 02 '13 at 06:15
  • Thanks Martin so much for your detailed reply. I had a couple of questions with regards to "option four". We have Application One that spawns a thread and uses Mutex to lock and write to the file in a common network folder. This application is hosted on 8 web servers. We are getting the above error when writing to the file. In this case I am not sure if "Multiple processes, multiple machines" or "Multiple processes, single machine" is the correct scenario. Can you please advise? – Ajit Goel May 02 '13 at 14:50
  • @Ajit: "8 web servers and one shared network folder" falls under the 4th option: "Multiple processes, multiple machines"... In your case I would go for option 4, first bullet in combination with performance option 2. This is the easiest to implement. Of course option 4, second bullet is the best, possibly with performance option 2 too. – Martin Mulder May 02 '13 at 15:04
  • When you get a chance could you review my code here?. The fix seems to have made the problem much much worse. http://codereview.stackexchange.com/questions/28856/correct-way-to-read-write-to-a-file-heavily-used-application – Ajit Goel Jul 23 '13 at 03:10
1

I think that the ReaderWriterLock in combination with FileShare.ReadWrite is your solution. (Note, the document page I'm linking you to refers you to a better version called ReaderWriterLockSlim, which should be at least as good.)

You need FileShare.ReadWrite on each thread, so they can all access it however necessary. Any time a thread needs to read, have it AcquireReaderLock (and ReleaseReaderLock when the read is complete).

When you want to write, just use UpgradeToWriterLock, and when you're done, DowngradeFromWriterLock.

This should let all your threads access the file read-only at all times, and let any one thread grab the access to write whenever necessary.
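
A rough sketch of this idea using ReaderWriterLockSlim (available in .NET 3.5) is shown below. It uses `EnterReadLock`/`EnterWriteLock` rather than the older `AcquireReaderLock`/`UpgradeToWriterLock` calls named above, and the class name `LockedXmlCache` is made up for the example. It assumes all readers and writers live in a single process and share one instance; the lock does nothing across processes.

```csharp
using System.IO;
using System.Threading;
using System.Xml;

public sealed class LockedXmlCache
{
    // One lock shared by every thread in this process.
    private readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();

    // Many threads may hold the read lock at the same time.
    public XmlDocument Read(string path)
    {
        cacheLock.EnterReadLock();
        try
        {
            var document = new XmlDocument();
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
            {
                document.Load(stream);
            }
            return document;
        }
        finally
        {
            cacheLock.ExitReadLock();
        }
    }

    // Only one thread may hold the write lock, and only when no readers are active.
    public void Write(string path, XmlDocument document)
    {
        cacheLock.EnterWriteLock();
        try
        {
            // FileMode.Create truncates any existing file before writing.
            using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.ReadWrite))
            {
                document.Save(stream);
            }
        }
        finally
        {
            cacheLock.ExitWriteLock();
        }
    }
}
```

The `try`/`finally` pairs matter: if a read or write throws, the lock is still released, so one failed operation does not block every later caller.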

Hope that helps!

Scott Mermelstein
  • How will this solution work if 2 processes are trying to write to the same file at the same time? It seems the recommended solution is to use Mutexes. Please see here: http://stackoverflow.com/questions/1160233/concurrent-file-write. "Don't use neither ;). ReadWriter locks, Monitors (locks) etc. are not capable of inter process locks, which is what author meant I think (Like, if two different processes are writing in the same moment to the file)?" – Ajit Goel May 01 '13 at 07:04
  • Sorry, I thought you were talking about simple threads, not inter-process communication. I see in your comments with @MartinMulder that your system is much more complex than that, and ReaderWriterLock doesn't handle that. Martin seems to have listed all of the options available to you. Off the cuff, with that large a system, it sounds like you really should be using a database. – Scott Mermelstein May 01 '13 at 14:21
  • When you get a chance could you review my code here?. The fix seems to have made the problem much much worse. http://codereview.stackexchange.com/questions/28856/correct-way-to-read-write-to-a-file-heavily-used-application – Ajit Goel Jul 23 '13 at 03:10
  • @AjitGoel I glanced at your link, but I don't have enough to contribute to your question to bother activating an account on codereview. Having an extra `>` at the end doesn't seem to have anything to do with the code you showed. It should be in how you generate `stuff`. – Scott Mermelstein Jul 23 '13 at 04:55
  • I believe you are right. The problem is with how I generate stuff. I need to use readerWriterLockSlim.EnterReadLock() and readerWriterLockSlim.ExitReadLock() in the function from where the WriteFile function is being called and where the stuff is being created. Funny how a second pair of eyes can help. :) – Ajit Goel Jul 23 '13 at 05:20
0

I believe everyone has to open the file in FileShare.ReadWrite mode.

If someone opens it in FileShare.Read mode, and someone else tries to write to it, it will fail. I don't think they are compatible, because one is saying "share for read only", but the other wants to write. You may need to use FileShare.ReadWrite on all of them, OR minimize the amount of writing to minimize the conflicts.
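
The conflict can be checked with a small experiment along these lines. The class and method names are invented for the example. On Windows, the call with `FileShare.Read` is expected to return false because the reader's share mode does not permit a writer; other platforms implement share modes differently, so treat the result there as unspecified.

```csharp
using System.IO;

public static class ShareModeDemo
{
    // Opens a reader with the given share mode, then tries to open a writer
    // on the same file while the reader is still open.
    // Returns true if the writer could be opened, false on an IOException.
    public static bool WriterCanOpenWhileReading(string path, FileShare readerShare)
    {
        using (var reader = new FileStream(path, FileMode.Open, FileAccess.Read, readerShare))
        {
            try
            {
                using (var writer = new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.ReadWrite))
                {
                    return true;
                }
            }
            catch (IOException)
            {
                return false; // sharing violation: the reader did not allow writers
            }
        }
    }
}
```

Calling it once with `FileShare.ReadWrite` and once with `FileShare.Read` on the same file shows the difference between the two share modes.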

http://msdn.microsoft.com/en-us/library/system.io.fileshare.aspx

Another option is to use FileShare.Read, and copy the file when making modifications. If there is a single entry point for all modifications, then it can use FileShare.Read to copy the file. Modify the copy, and then update some sort of variable/property that indicates the current file path. Once that is updated, all other processes reading from the file would use that new location. This would allow the modification to occur and complete, and then make all of the readers aware of the new modified file. Again, only viable if you can centralize the modifications. Maybe through some sort of modification request queue if necessary.
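
One possible shape for this copy-then-switch approach is sketched below. Everything here is an assumption made for illustration: the class name `VersionedCacheFile`, the use of a GUID for the new file name, and the in-memory `CurrentPath` property as the "variable that indicates the current file path". That last choice means the sketch only works inside one process; sharing the current path across processes or machines would need some other mechanism.

```csharp
using System;
using System.IO;
using System.Xml;

public sealed class VersionedCacheFile
{
    private readonly string directory;
    private readonly object writeGate = new object(); // single entry point for modifications
    private volatile string currentPath;              // readers only ever see a finished file

    public VersionedCacheFile(string directory, string initialPath)
    {
        this.directory = directory;
        this.currentPath = initialPath;
    }

    // Readers open this path with FileAccess.Read and FileShare.Read.
    public string CurrentPath
    {
        get { return currentPath; }
    }

    // Copies the current file, applies 'change' to the copy, then publishes the copy.
    public void Modify(Action<XmlDocument> change)
    {
        lock (writeGate)
        {
            string newPath = Path.Combine(directory, Guid.NewGuid().ToString("N") + ".xml");
            File.Copy(currentPath, newPath);

            var document = new XmlDocument();
            document.Load(newPath);
            change(document);
            document.Save(newPath);

            currentPath = newPath; // switch readers over only after the write is complete
        }
    }
}
```

The sketch never deletes old versions; a real implementation would need a cleanup step that runs once no reader still has the previous file open.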

AaronLS