3

I fetch data from a database and use File.AppendAllText to write each database row into its own file, totalling 8000 files. This whole process repeats in a loop (of queries) and it seems to be taking forever. Is there a better way to go about doing this? Does it make sense to store all the file handles?

Aks
  • If it is going in a loop, why not just open a file, write N lines, then close it? – leppie Feb 22 '11 at 06:10
  • With each query, I get n lines and each line belongs to a separate file, and there isn't another option. It has to be this way. With the next query, I get n more lines and they distribute into the same n files – Aks Feb 22 '11 at 06:12
  • There must be a better way, but what it is depends on your objective. What are you actually trying to do? – Sam Feb 22 '11 at 06:17
  • Try to test your query performance. Also, if you can use parallel processing, maybe that will improve the performance. Take a look at Parallel.ForEach or Parallel.For loops. – Radu Caprescu Feb 22 '11 at 06:19
  • I send one query for each day of the year and it returns data for n users for that day. I have to store this into n files (one for each user). This way, data for the entire year gets appended to the user files. I cannot query user by user. That's a restriction. – Aks Feb 22 '11 at 06:20
  • @leppie: you should explain to him what happens behind the scenes: `File.AppendAllText` opens the file, writes the line and closes the file. The open and close operations are slowing everything down because they are repeated n times, but an "intelligent" design of the code would open the file once (using a `using (...)`), write all the lines and then let the `using` close it. – xanatos Feb 22 '11 at 06:21
  • Then you are screwed, unless you know you can keep, for example, 7 days of data for all the users in memory (you buffer 7 days, then write each user's data: 7 days of user1, 7 days of user2, 7 days of user3...). In this way you'll have only 1/7 of the open/close calls (see the sketch after these comments). – xanatos Feb 22 '11 at 06:23
  • Not ideal, but keep the `TextWriter`s open, just write a line and `Flush` (very important). When the app exits, close all the `TextWriter`s. If the app dies, they will be closed automatically. – leppie Feb 22 '11 at 06:32
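
Not from the thread itself, but a minimal sketch of xanatos' buffering idea above, assuming a hypothetical GetRowsForDay(day) method that returns (user, line) pairs for one day's query: accumulate several days of lines per user in memory, then append each user's batch in a single call, so every file is opened and closed once per batch instead of once per row.

// Sketch only: buffer batchDays days of rows per user, then append each
// user's batch with a single File.AppendAllLines call.
// Requires System.Collections.Generic and System.IO;
// GetRowsForDay and the (user, line) shape are hypothetical stand-ins.
var buffer = new Dictionary<string, List<string>>();
const int batchDays = 7;

for (int day = 1; day <= 365; day++)
{
    foreach (var (user, line) in GetRowsForDay(day))   // hypothetical per-day query
    {
        if (!buffer.TryGetValue(user, out var lines))
            buffer[user] = lines = new List<string>();
        lines.Add(line);
    }

    if (day % batchDays == 0 || day == 365)
    {
        foreach (var pair in buffer)
            File.AppendAllLines(pair.Key + ".txt", pair.Value);   // one open/close per user per batch
        buffer.Clear();
    }
}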

4 Answers

9

The documentation for `File.AppendAllText` says "Opens a file, appends the specified string to the file, and then closes the file".
That is, you open and close your file for each line.

Use something like this instead:

string[] dbRows = new string[] { "1", "2", "3" };
// true = append mode; the file is opened once and closed when the using block ends
using (var file = new StreamWriter("file.txt", true))
{
    foreach (var row in dbRows)
    {
        file.WriteLine(row);
    }
}
Albin Sunnanbo
  • I know this is an old question, but this is a better explanation of what is going on (and a better solution). – Simon Goodman Dec 09 '16 at 04:24
  • AppendAllText is much faster for me than AppendText (which creates a StreamWriter) on an SSD when I read line by line over HTTP – Geograph Jun 02 '21 at 19:31
4

The slowness may be related to several factors, and in the following cases parallelism won't help:

  • Disk I/O: you will just make the I/O even slower
  • Network capacity: once you reach the network limit, you will just get several files at the same time but overall speed won't change

You could also try to keep the target files open for the duration of the process and close all the files after all the source data is processed. This will avoid the whole open/write/close loop on each user/file.
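
A minimal sketch of that approach, assuming the same hypothetical GetRowsForDay(day) query used above: open one StreamWriter per user on first use, reuse it for every subsequent query, and close everything at the end.

// Sketch only: one writer per user file, kept open for the whole run
// (requires System.Collections.Generic and System.IO).
var writers = new Dictionary<string, StreamWriter>();
try
{
    for (int day = 1; day <= 365; day++)
    {
        foreach (var (user, line) in GetRowsForDay(day))   // hypothetical per-day query
        {
            if (!writers.TryGetValue(user, out var writer))
                writers[user] = writer = new StreamWriter(user + ".txt", append: true);
            writer.WriteLine(line);
        }
    }
}
finally
{
    foreach (var writer in writers.Values)
        writer.Dispose();   // flushes and closes each file exactly once
}

Keeping roughly 8000 writers open at once is normally within OS limits; calling Flush() after each write, as leppie suggests in the comments, trades some speed for safety if the process dies mid-run.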

Johann Blais
1

When using an explicit FileStream you can set the buffer size. Some time ago I found out that changing the buffer size can have a dramatic impact on performance.
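
A minimal sketch of that idea; the 64 KB figure is just an assumption to experiment with, not a value from the answer:

// Sketch only: open the file once with an explicit buffer size (here 64 KB),
// write all rows, and let the using blocks close it; tune bufferSize and measure.
using (var stream = new FileStream("file.txt", FileMode.Append, FileAccess.Write,
                                   FileShare.Read, bufferSize: 64 * 1024))
using (var writer = new StreamWriter(stream))
{
    foreach (var row in dbRows)   // dbRows as in the accepted answer
        writer.WriteLine(row);
}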

Perhaps you can also parallelize reading from SQL Server and writing to disk, for example using a DataReader.
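
Not spelled out in the answer, but one way to read that suggestion is a producer/consumer pair (a sketch; the connection string, query, column layout and `day` variable are hypothetical): a SqlDataReader streams rows on one thread while a second task drains a queue and writes to disk.

// Sketch only: overlap the SQL read with the disk writes.
// Requires System.Data.SqlClient, System.Collections.Concurrent,
// System.Threading.Tasks and System.IO; the query and schema are made up.
var queue = new BlockingCollection<(string User, string Line)>(boundedCapacity: 10000);

var writerTask = Task.Run(() =>
{
    foreach (var (user, line) in queue.GetConsumingEnumerable())
        File.AppendAllText(user + ".txt", line + Environment.NewLine);
});

using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("SELECT UserName, Data FROM DailyRows WHERE Day = @day", connection))
{
    command.Parameters.AddWithValue("@day", day);
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
            queue.Add((reader.GetString(0), reader.GetString(1)));
    }
}

queue.CompleteAdding();
writerTask.Wait();

The consumer above still uses File.AppendAllText for brevity; in practice you would combine it with one of the keep-the-file-open approaches from the other answers.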

Also have a look at http://blogs.msdn.com/b/blogdoezequiel/archive/2011/02/11/best-practices-on-filestream-implementations.aspx

The link is mainly about SQL Server but also has some hints on FileStreams in general.

Boas Enkler
0

I'm working on the same issue... writing smaller amounts more often is a lot faster, so it perhaps has more to do with byte allocation than with opening/closing the stream writer.

Check out different frequencies and sizes of the write process...

If you write strings of 500/1000/10000/100000 characters, the performance differs. In a loop, I have found the fastest performance is with strings of 200-300 characters: my test takes about 5 seconds, while writing 10000-character strings takes about 30 seconds on an SSD.

That contradicts the idea that the bottleneck is due to write frequency! See also: C# Performance - Chunking Write of file with AppendAllText
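
A minimal sketch of that kind of measurement, with arbitrary example sizes (not the exact test from the answer):

// Sketch only: time File.AppendAllText with different chunk sizes to see
// which write size performs best on your own disk.
// Requires System, System.Diagnostics and System.IO.
foreach (var chunkSize in new[] { 200, 1000, 10000 })
{
    var path = $"chunk-{chunkSize}.txt";
    var chunk = new string('x', chunkSize);
    var timer = Stopwatch.StartNew();

    // write roughly 10 MB in chunks of the given size
    for (int written = 0; written < 10_000_000; written += chunkSize)
        File.AppendAllText(path, chunk);

    timer.Stop();
    Console.WriteLine($"{chunkSize}-char chunks: {timer.ElapsedMilliseconds} ms");
}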

bandybabboon