
I have a flat file larger than 1 GB. I need to parse the file and insert it into a DB.

Method 1: Parse and convert the flat file to CSV, then use a BULK INSERT in SQL.

Problem:

Using async for the file read, I get a System.OutOfMemoryException.

It takes nearly 30 minutes (depending on the file size) to do so.

Do I need to make more efficient use of data structures to reduce the time, or is there some other, more efficient methodology?

Yoshi
  • Have you considered using SSIS? Alternatively, in large read/write operations in the past I have used a StreamReader to read the file, parse it and add to a DataTable, then by monitoring the size of this DataTable write to the database in stages, to avoid out of memory exceptions. So read x rows, write, clear the table, continue reading, and so on until the file is complete. This can all be done asynchronously very easily with a BackgroundWorker, or using delegates. – GarethD Jan 23 '13 at 08:40
    Just wanted to check - you're not reading the entire file into memory before parsing it, are you? – Matthew Watson Jan 23 '13 at 08:43
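The staged read/parse/write approach described in the comment above can be sketched roughly as follows. This is a minimal sketch: the connection string, table name, column layout, and pipe delimiter are all assumptions — adjust them to your actual flat-file format and schema.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;

class FlatFileLoader
{
    const int BatchSize = 10000; // rows buffered in memory before each bulk write

    static void Main()
    {
        // Assumed connection string and schema — replace with your own.
        string connectionString = "Server=.;Database=MyDb;Integrated Security=true";

        var table = new DataTable();
        table.Columns.Add("Col1", typeof(string));
        table.Columns.Add("Col2", typeof(string));

        using (var reader = new StreamReader("bigfile.txt"))
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.MyTable";

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] fields = line.Split('|'); // assumed delimiter

                table.Rows.Add(fields[0], fields[1]);

                if (table.Rows.Count >= BatchSize)
                {
                    bulk.WriteToServer(table); // flush the batch to SQL Server
                    table.Clear();             // keep memory usage bounded
                }
            }

            if (table.Rows.Count > 0)
                bulk.WriteToServer(table); // final partial batch
        }
    }
}
```

Because only `BatchSize` rows are ever held in memory at once, this avoids the OutOfMemoryException regardless of the file size.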

2 Answers


Perhaps check the question here. Specifically, if you are using .NET 4.0, it mentions the use of MemoryMappedFile.

However, is there really a need to bring this into C# if you just want to insert it into a database? Consider doing it with SQL as per the article here.
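If the file can first be converted to a well-formed CSV, the whole load can stay in SQL Server. A sketch of what that might look like (the file path, table name, and terminators are assumptions):

```sql
BULK INSERT dbo.MyTable
FROM 'C:\data\bigfile.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    BATCHSIZE = 10000,   -- commit in batches to limit transaction log growth
    TABLOCK              -- enables minimally logged inserts where possible
);
```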

Nick
    I wouldn't use the MemoryMappedFile if the data is being read sequentially. – Andrew Barber Jan 23 '13 at 08:31
  • @AndrewBarber agreed. Having reread the question it appears the goal is to simply get the data inserted into the database, which probably requires a different approach altogether :) – Nick Jan 23 '13 at 08:32
  • @Nick: Yeah I could not use MemoryMappedFile. Not enough storage is available to process MemoryMappedViewStream command for the file. – Yoshi Jan 23 '13 at 09:10
  • @Yoshi: Yeah no problem. Is there a reason to pull this into C# at all? Or are you simply looking to get the data into the database? – Nick Jan 23 '13 at 09:27

You can use a simple FileStream with a BinaryReader, reading the file in fixed-size chunks, like this:

bool isSaving = true, savedOk = false;
new Task(() =>
{
    try
    {
        string fileName = "filename.txt";
        using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        using (BinaryReader br = new BinaryReader(fs))
        {
            long size = new FileInfo(fileName).Length;
            const int chunkSize = 512000;
            int bytesPerRead = (int)Math.Min(size, chunkSize);
            byte[] chunk = br.ReadBytes(bytesPerRead);
            long offset = 0;

            while (offset < size)
            {
                //TODO - SAVE YOUR BYTE ARRAY (chunk) TO DB USING WCF OR A LOCAL PROC

                offset += bytesPerRead;
                bytesPerRead = (int)Math.Min(size - offset, chunkSize);
                chunk = br.ReadBytes(bytesPerRead);
            }
            savedOk = true;
        }
    }
    catch (Exception) { savedOk = false; }
    isSaving = false;
}).Start();

while (isSaving)
{
    //DOUPDATES
    Thread.Sleep(100); // avoid spinning a core while waiting for the task
}
Alex